METHODS FOR IDENTIFYING COMPOUNDS
THAT MODULATE UNTRANSLATED REGION-DEPENDENT GENE 
EXPRESSION AND METHODS OF USING SAME 

[0001] This application is entitled to and claims priority benefit to U.S. provisional 
application Serial No. 60/441,637, filed January 21, 2003, which is incorporated herein by 
reference in its entirety. 

1. INTRODUCTION 

[0002] The present invention relates to a method for screening and identifying 
compounds that modulate untranslated region-dependent expression of any gene. In 
particular, the invention provides reporter gene-based assays for the identification of
compounds that modulate untranslated region-dependent expression of a gene. The 
methods of the present invention provide a simple, sensitive assay for high-throughput 
screening of libraries of compounds to identify pharmaceutical leads. 

2. BACKGROUND OF THE INVENTION 

2.1. Gene Expression
[0003] Every living organism is a product of expression of its genes in response to a 
developmental program (encoded in the genome itself) and environmental factors. Gene 
expression can be defined as the conversion of the nucleotide sequence of a gene into the
amino acid sequence of a protein or into the nucleotide sequence of a stable RNA. 
[0004] In eukaryotes, gene expression begins in the nucleus with the transcription of a 
gene into a premessenger RNA, also referred to as a primary transcript. While still in the
nucleus, the pre-mRNA is extensively modified. Each primary transcript is capped at the 5'
end, associates with hnRNP proteins to form messenger RNA-protein particles ("mRNPs"),
acquires a polyadenylic acid tail at the 3' end, and undergoes splicing to remove introns. In
addition, the nucleotide sequence of certain pre-mRNAs can be altered
post-transcriptionally in a process known as RNA editing. Thus processed, the mature
mRNA is exported to the cytoplasm. Upon export, mRNA dissociates from hnRNP
proteins and binds a set of cytosol-specific mRNA-binding proteins. Once in the 
cytoplasm, the mRNA either immediately associates with ribosomes and templates for 
protein synthesis or is localized to discrete cellular foci to direct compartment-specific 
protein synthesis. Degradation of mRNA and protein, which occurs both in the nucleus and 
the cytoplasm, concludes the list of processes that comprise gene expression. 
2.2. Post-transcriptional Gene Expression Regulation
[0005] Gene expression is very tightly regulated. To produce the desired phenotype, 
each gene must be expressed at a defined time and at a defined rate and amount. Extensive 
experimental evidence indicates that post-transcriptional processes such as mRNA decay, 
translation, and mRNA localization constitute major control points in gene expression.
[0006] An aberration in the expression of one or more genes can be the cause or a 
downstream effect of a disease or other abnormality. Understanding gene expression
regulation mechanisms in the normal/healthy/wild-type cell/body and during pathology will 
permit rational therapeutic intervention. 

[0007] Regulation of gene expression both at the mRNA stability and translation levels
is important in cellular responses to development or environmental stimuli such as nutrient
levels, cytokines, hormones, and temperature shifts, as well as environmental stresses like
hypoxia, hypocalcemia, viral infection, and tissue injury (reviewed in Guhaniyogi &
Brewer, 2001, Gene 265(1-2):11-23). Furthermore, alterations in mRNA stability have
been causally connected to specific disorders, such as neoplasia, thalassemia, and
Alzheimer's disease (reviewed in Guhaniyogi & Brewer, 2001, Gene 265(1-2):11-23 and
Translational Control of Gene Expression, Sonenberg, Hershey, and Mathews, eds., 2000, 
CSHL Press). In contrast, regulation of gene expression at the mRNA localization level is 
primarily used by the cell to create and maintain polarity (internal gradients of protein 
concentration) (reviewed in Translational Control of Gene Expression, Sonenberg, Hershey, 
and Mathews, eds., 2000, CSHL Press). 

2.3. mRNA Untranslated Regions in Gene Expression Regulation 
[0008] A typical mRNA contains a 5' cap, a 5' untranslated region ("5' UTR")
upstream of a start codon, an open reading frame, also referred to as coding sequence, that
encodes a stable RNA or a functional protein, a 3' untranslated region ("3' UTR")
downstream of the termination codon, and a poly(A) tail. Most studied cis-dependent
RNA-based gene expression regulation elements map to the 5' or 3' UTRs.
[0009] Examples of 5' UTR regulatory elements include the iron response element
("IRE"), internal ribosome entry site ("IRES"), upstream open reading frame ("uORF"),
male specific lethal element ("MSL-2"), G-quartet element, and 5'-terminal
oligopyrimidine tract ("TOP") (reviewed in Keene & Tenenbaum, 2002, Mol Cell 9:1161
and Translational Control of Gene Expression, Sonenberg, Hershey, and Mathews, eds., 
2000, CSHL Press). 

[0010] Examples of 3' UTR regulatory elements include AU-rich elements ("AREs"),
selenocysteine insertion sequence ("SECIS"), histone stem loop, cytoplasmic
polyadenylation elements ("CPEs"), nanos translational control element, amyloid precursor
protein element ("APP"), translational regulation element ("TGE")/direct repeat element 
("DRE"), bruno element ("BRE"), 15-lipoxygenase differentiation control element 
(15-LOX-DICE), and G-quartet element (reviewed in Keene & Tenenbaum, 2002, Mol Cell 
9:1161). 

[0011] The internal ribosome entry site ("IRES") is one of the 5' UTR-based cis-acting
elements of post-transcriptional gene expression control. IRESes facilitate cap-independent
translation initiation by recruiting ribosomes directly to the mRNA start codon. IRESes are
commonly located in the 3' region of a 5' UTR and are, as recent work has established,
frequently composed of several discrete sequences. IRESes do not share significant
primary structure homology, but do form distinct RNA secondary and tertiary structures.
Some IRESes contain sequences complementary to 18S RNA and therefore may form 
stable complexes with the 40S ribosomal subunit and initiate assembly of translationally 
competent complexes. A classic example of an "RNA-only" IRES is the internal ribosome 
entry site from Hepatitis C virus. However, most known IRESes require protein co-factors 
for activity. More than 10 IRES trans-acting factors ("ITAFs") have been identified so far. 
In addition, all canonical translation initiation factors, with the sole exception of 5' end cap-
binding eIF4E, have been shown to participate in IRES-mediated translation initiation
(reviewed in Vagner et al., 2001, EMBO reports 2:893 and Translational Control of Gene
Expression, Sonenberg, Hershey, and Mathews, eds., 2000, CSHL Press). 
[0012] AU-rich elements ("AREs") are 3' UTR-based regulatory signals. AREs are the 
primary determinant of mRNA stability and one of the key determinants of mRNA
translation initiation efficiency. A typical ARE is 50 to 150 nucleotides long and contains 3
to 6 copies of AU3A pentamers embedded in a generally A/U-enriched RNA region. The
AU3A pentamers can be scattered within the region or can stagger or even overlap (see,
e.g., Chen et al., 1995, Trends Biol Sci 20:465). One or several AU3A pentamers can be
replaced by expanded versions such as AU4A hexamers or AU5A heptamers (see, e.g., Wilkund et al.,
2002, J Biol Chem 277:40462 and Tholanikunnel and Malbon, 1997, J Biol Chem
272:11471). Single copies of the AUnA (where n = 3, 4, or 5) elements placed in a random 
sequence context are inactive. The minimal active ARE has the sequence 
U2AUnA(U/A)(U/A) (where n = 3, 4, or 5) (see, e.g. Worthington et al., 2002, J Biol Chem, 
277:48558-64). The activity of certain AU-rich elements in promoting mRNA degradation 
is enhanced in the presence of distal uridine-rich sequences. These U-rich elements do not 
affect mRNA stability when present alone and thus have been termed "ARE enhancers"
(see, e.g., Chen et al., 1994, Mol. Cell. Biol. 14:416).
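
The motif definitions in paragraph [0012] can be illustrated with a short sequence scan. The following is a minimal sketch only, not part of the disclosed methods: the function name find_motifs and the two pattern names are hypothetical, and the regular expressions simply transcribe the AUnA core (n = 3, 4, or 5) and the minimal active ARE U2AUnA(U/A)(U/A) as stated above. A lookahead is used so that overlapping copies of a motif are all reported.

```python
import re

# Core motif AUnA with n = 3, 4, or 5, as described in the text.
# The lookahead wrapper allows overlapping matches to be found.
CORE_AUNA = re.compile(r"(?=(AU{3,5}A))")

# Minimal active ARE as stated in the text: U2 AUnA (U/A)(U/A).
MINIMAL_ARE = re.compile(r"(?=(UUAU{3,5}A[UA][UA]))")


def find_motifs(sequence, pattern):
    """Return (start, motif) pairs for every match, overlapping ones included.

    The input may be RNA or DNA; T is converted to U before scanning.
    """
    rna = sequence.upper().replace("T", "U")
    return [(m.start(), m.group(1)) for m in pattern.finditer(rna)]


if __name__ == "__main__":
    utr = "GGAUUUAUUUAUUUAGG"  # illustrative sequence, not from a real gene
    for start, motif in find_motifs(utr, CORE_AUNA):
        print(start, motif)
```

A scan of this kind only flags candidate sites; as the text notes, a single AUnA copy in a random sequence context is inactive, so a core match without a minimal-ARE match should not be read as a functional element.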

[0013] Most AREs function in mRNA decay regulation and translation initiation
regulation by interacting with specific ARE-binding proteins ("AUBPs"). There are at least
14 known cellular proteins that bind to AU-rich elements. AUBP functional properties
determine ARE involvement in one or both pathways. For example, ELAV/HuR binding to
the c-fos ARE inhibits c-fos mRNA decay (see, e.g., Brennan & Steitz, 2001, Cell Mol Life Sci.
58:266), association of tristetraprolin with the TNFα ARE dramatically enhances TNFα mRNA
hydrolysis (see, e.g., Carballo et al., 1998, Science 281:1001), whereas interaction of TIA-1
with the TNFα ARE does not alter the TNFα mRNA stability but inhibits TNFα translation
(see, e.g., Piecyk et al., 2000, EMBO J. 19:4154). Given its size, it is very likely that one
copy of a typical ARE is capable of interacting with several AUBP molecules. Therefore,
it is contended that in the cell the competition of multiple AUBPs for the limited set of
AUBP-binding sites in an ARE and the resulting "ARE proteome" determine the ARE
regulatory output. 

[0014] The mechanism of ARE-mediated mRNA decay is poorly understood. It has 
been established that mammalian mRNA degradation proceeds in the 3' to 5' direction and that
the first step is deadenylation by poly(A)-specific ribonuclease ("PARN"). Recent work
indicates that following deadenylation a stable multi-ribonuclease complex, termed the
exosome, degrades the body of the message. The exosome alone is capable of initiating and
accomplishing mRNA decay. However, the presence of certain AREs upregulates
degradation efficiency. Available evidence suggests that AREs alone or bound by AUBPs
help recruit the exosome to the RNA (see, e.g., Chen et al., 2001, Cell 107:451 and Mukherjee
et al., 2002, EMBO J. 21:165). 

[0015] It has been reported that degradation of some mRNAs depends on ongoing 
translation. Thus, the translation machinery can also serve as a ribonuclease-recruiting or
stabilizing AUBP-removing entity. Supporting evidence indicates that this mechanism
may operate only on a subset of mRNAs under special cell growth conditions (see, e.g.,
Curatola et al., 1995, Mol. Cell. Biol. 15:6331; Chen et al., 1995, Mol. Cell. Biol. 15:5777;
Koeller et al., 1991, Proc. Natl. Acad. Sci. 88:7778; Savant-Bhonsale et al., 1992, Genes 
Dev. 6:1927; and Aharon & Schneider, 1993, Mol. Cell. Biol. 13:1971). 
[0016] The mechanism of ARE-dependent translation regulation is understood even less 
well than that of ARE-mediated mRNA decay. It is not clear how a 3' UTR-localized
element can affect translation initiation, a process that takes place in the 5' UTR. One
plausible explanation comes from recent work showing that most or all cytoplasmic mRNPs
are circularized via eIF4F - poly(A)-binding protein ("PABP") interaction. This interaction
can bring AREs in the 3' UTR into close proximity to the translation initiation site (see,
e.g., Wells et al., 1998, Mol. Cell. 2:135).

[0017] Citation or identification of any reference in Section 2 of this application is not
an admission that such reference is available as prior art to the present invention. 

3. SUMMARY OF THE INVENTION 

[0018] The present invention provides methods for identifying a compound that
modulates untranslated region-dependent expression of a target gene. In particular, the
invention provides methods for identifying compounds that down-regulate the translation or
the stability of an mRNA of a target gene that is associated with or has been linked to the
onset, development, progression or severity of a particular disease or disorder, said
compounds functioning, at least in part, by targeting one or more aspects of untranslated
region-dependent expression of the target gene. The invention also provides methods for
identifying compounds that upregulate the translation or the stability of an mRNA of a
target gene whose expression is beneficial to a subject with a particular disease or disorder,
said compounds functioning, at least in part, by targeting one or more aspects of
untranslated region-dependent expression of the target gene. The invention encompasses
the use of the compounds identified utilizing the methods of the invention for modulating
the expression of a target gene in vitro and in vivo. In particular, the invention encompasses
the use of the compounds identified utilizing the methods of the invention for the
prevention, treatment or amelioration of a disease or disorder or a symptom thereof.

[0019] The invention provides reporter gene-based assays for the identification of a
compound that modulates untranslated region-dependent expression of a target gene. The
reporter gene-based assays may be conducted by contacting a compound with a cell
genetically engineered to express a nucleic acid comprising a reporter gene operably linked
to one or more untranslated regions of a target gene, and measuring the expression of said
reporter gene. Alternatively, the reporter gene-based assays may be conducted by
contacting a compound with a cell-free translation mixture and a nucleic acid comprising a
reporter gene operably linked to one or more untranslated regions of a target gene, and
measuring the expression of said reporter gene. The alteration in reporter gene expression
relative to a previously determined reference range or a control in such reporter gene-based
assays indicates that a particular compound modulates untranslated region-dependent
expression of a target gene. In a specific embodiment, a compound identified utilizing a
reporter gene-based assay described herein alters the expression of the reporter gene by at
least 25%, at least 30%, at least 35%, at least 40%, at least 45%, at least 50%, at least 55%,
at least 60%, at least 65%, at least 70%, at least 75%, at least 80%, at least 85%, at least
90%, at least 95%, or at least 99%, or at least 1.5 fold, at least 2 fold, at least 2.5 fold, at
least 5 fold, at least 7.5 fold or at least 10 fold relative to a control (e.g., PBS), the absence 
of a control or a previously determined reference range in an assay described herein or well- 
known in the art. In order to exclude the possibility that a particular compound is 
functioning solely by modulating the expression of a target gene in an untranslated region- 
independent manner, one or more mutations (i.e., deletions, insertions, or nucleotide 
substitutions) may be introduced into the untranslated regions operably linked to a reporter
gene and the effect on the expression of the reporter gene in a reporter gene-based assay 
described herein can be determined. 
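
The comparison described in paragraph [0019] can be sketched numerically. The following is an illustrative sketch under stated assumptions, not the assay's prescribed analysis: the function names are hypothetical, reporter signals are assumed to be positive readings in arbitrary units, the 25% default is the lowest threshold listed above, and the mutant-construct check assumes that a compound acting through the untranslated regions loses its effect when those regions are mutated.

```python
def percent_change(treated, control):
    """Magnitude of the change in reporter signal, as a percentage of control."""
    if control <= 0:
        raise ValueError("control signal must be positive")
    return abs(treated - control) / control * 100.0


def fold_change(treated, control):
    """Fold change in the direction of the effect (always >= 1)."""
    if control <= 0:
        raise ValueError("control signal must be positive")
    if treated <= 0:
        return float("inf")
    ratio = treated / control
    return ratio if ratio >= 1.0 else 1.0 / ratio


def is_hit(treated, control, min_percent=25.0):
    """True if the compound alters reporter expression by at least min_percent."""
    return percent_change(treated, control) >= min_percent


def is_utr_dependent(wt_treated, wt_control, mut_treated, mut_control,
                     min_percent=25.0):
    """Assumed decision rule: a hit on the wild-type UTR reporter that is
    lost on the mutated-UTR reporter is treated as UTR-dependent."""
    return (is_hit(wt_treated, wt_control, min_percent)
            and not is_hit(mut_treated, mut_control, min_percent))
```

For example, a compound that lowers the wild-type reporter signal from 1000 to 400 units but leaves the mutated-UTR reporter at 950 of 1000 units would be classified as UTR-dependent under this rule, whereas one that lowers both reporters to a similar degree would not.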

[0020] In one embodiment, the invention provides a method for identifying a compound 
that modulates untranslated region-dependent expression of a target gene, said method 
comprising: (a) expressing a nucleic acid comprising a reporter gene operably linked to two, 
three or more untranslated regions of said target gene in a cell; (b) contacting said cell with 
a member of a library of compounds; and (c) detecting the expression of said reporter gene, 
wherein a compound that modulates untranslated region-dependent regulation of expression 
is identified if the expression of said reporter gene in the presence of a compound is altered 
as compared to the expression of said reporter gene in the absence of said compound or the 
presence of a control (e.g., phosphate buffered saline ("PBS")). In an alternative
embodiment, the invention provides a method for identifying a compound that modulates
untranslated region-dependent expression of a target gene, said method comprising: (a)
expressing a nucleic acid comprising a reporter gene operably linked to two, three or more
untranslated regions of said target gene in a cell; (b) contacting said cell with a member of a
library of compounds; and (c) detecting the expression of said reporter gene, wherein a
compound that modulates untranslated region-dependent expression is identified if the
expression of said reporter gene is altered in the presence of the compound relative to a
previously determined reference range.

[0021] In another embodiment, the invention provides a method for identifying a
compound that modulates untranslated region-dependent expression of a target gene, said
method comprising: (a) expressing a nucleic acid consisting of a reporter gene operably
linked to one, two, three or more untranslated regions of the target gene in a cell; (b)
contacting said cell with a member of a library of compounds; and (c) detecting the 
expression of said reporter gene, wherein a compound that modulates untranslated region- 
dependent regulation of expression is identified if the expression of said reporter gene in the 
presence of a compound is altered as compared to the expression of said reporter gene in the 
absence of said compound or the presence of a control (e.g., PBS). In an alternative 
embodiment, the invention provides a method for identifying a compound that modulates
untranslated region-dependent expression of a target gene, said method comprising: (a) 
expressing a nucleic acid consisting of a reporter gene operably linked to one, two, three or 
more untranslated regions of the target gene in a cell; (b) contacting said cell with a member 
of a library of compounds; and (c) detecting the expression of said reporter gene, wherein a 
compound that modulates untranslated region-dependent expression is identified if the 
expression of said reporter gene in the presence of the compound is altered relative to a 
previously determined reference range. 

[0022] In another embodiment, the invention provides a method for identifying a
compound that modulates untranslated region-dependent expression of a target gene, said 
method comprising: (a) contacting a member of a library of compounds with a cell-free 
translation mixture and a nucleic acid comprising a reporter gene operably linked to one, 
two, three or more untranslated regions of said target gene; and (b) detecting the expression 
of said reporter gene, wherein a compound that modulates untranslated region-dependent 
expression is identified if the expression of said reporter gene in the presence of a 
compound is altered as compared to the expression of said reporter gene in the absence of 
said compound or the presence of a control (e.g., PBS). In an alternative embodiment, the
invention provides a method for identifying a compound that modulates untranslated 
region-dependent expression of a target gene, said method comprising: (a) contacting a 
member of a library of compounds with a cell-free translation mixture and a nucleic acid 
comprising a reporter gene operably linked to one, two, three or more untranslated regions
of said target gene; and (b) detecting the expression of said reporter gene, wherein a 
compound that modulates untranslated region-dependent expression of a target gene is 
identified if the expression of said reporter gene in the presence of a compound is altered 
relative to a previously determined reference range. 

[0023] In another embodiment, the invention provides a method for identifying a 
compound that modulates untranslated region-dependent expression of a target gene, said 
method comprising: (a) contacting a member of a library of compounds with a cell 
containing a nucleic acid comprising a reporter gene operably linked to one, two, three or 
more untranslated regions of said target gene; and (b) detecting a reporter protein translated 
from said reporter gene, wherein a compound that modulates untranslated region-dependent
expression is identified if the expression of said reporter gene in the presence of a
compound is altered as compared to the expression of said reporter gene in the absence of
said compound or the presence of a control (e.g., PBS). In an alternative embodiment, the
invention provides a method for identifying a compound that modulates untranslated
region-dependent expression of a target gene, said method comprising: (a) contacting a
member of a library of compounds with a cell containing a nucleic acid comprising a 
reporter gene operably linked to one, two, three or more untranslated regions of said target 
gene; and (b) detecting expression of said reporter gene, wherein a compound that 
modulates untranslated region-dependent expression is identified if the expression of said 
reporter gene in the presence of a compound is altered relative to a previously determined 
reference range. 

[0024] The invention also provides methods of identifying compounds that upregulate 
untranslated region-dependent expression of a target gene utilizing the reporter gene-based
assays described herein. In a specific embodiment, the invention provides a method of 
upregulating untranslated region-dependent expression of a target gene, said method 
comprising (a) contacting a compound with a cell containing a nucleic acid comprising a 
reporter gene operably linked to one, two, three or more untranslated regions of said target
gene; and (b) detecting a reporter gene protein translated from said reporter gene, wherein a 
compound that upregulates untranslated region dependent expression is identified if the 
expression of said reporter gene in the presence of a compound is increased relative to the 
absence of the compound or a previously determined reference range. In another 
embodiment, the invention provides a method of upregulating untranslated region- 
dependent expression of a target gene, said method comprising (a) contacting a compound 
with a cell-free translation mixture and a nucleic acid comprising a reporter gene operably 
linked to one, two, three or more untranslated regions of said target gene; and (b) detecting
a reporter gene protein translated from said reporter gene, wherein a compound that 
upregulates untranslated region dependent expression is identified if the expression of said 
reporter gene in the presence of a compound is increased relative to the absence of the 
compound or a previously determined reference range. 

[0025] The invention also provides methods of identifying compounds that down-
regulate untranslated region-dependent expression of a target gene utilizing the reporter
gene-based assays described herein. In a specific embodiment, the invention provides a
method of down-regulating untranslated region-dependent expression of a target gene, said 
method comprising (a) contacting a compound with a cell containing a nucleic acid 
comprising a reporter gene operably linked to one, two, three or more untranslated regions
of said target gene; and (b) detecting a reporter gene protein translated from said reporter 
gene, wherein a compound that down-regulates untranslated region dependent expression is 
identified if the expression of said reporter gene in the presence of a compound is decreased 
relative to the absence of the compound or a previously determined reference range. In 
another embodiment, the invention provides a method of down-regulating untranslated
region-dependent expression of a target gene, said method comprising (a) contacting a 
compound with a cell-free translation mixture and a nucleic acid comprising a reporter gene 
operably linked to one, two, three or more untranslated regions of said target gene; and (b) 
detecting a reporter gene protein translated from said reporter gene, wherein a compound 
that down-regulates untranslated region dependent expression is identified if the expression 
of said reporter gene in the presence of a compound is decreased relative to the absence of 
the compound or a previously determined reference range. 

[0026] In accordance with the invention, the step of contacting a compound with a cell,
or cell-free translation mixture and a nucleic acid in the reporter gene-based assays 
described herein is preferably conducted in an aqueous solution comprising a buffer and a 
combination of salts. In a specific embodiment, the aqueous solution approximates or
mimics physiologic conditions. In another specific embodiment, the aqueous solution 
further comprises a detergent or a surfactant. 

[0027] The present invention provides methods of identifying environmental stimuli 
(e.g., exposure to different concentrations of CO2 and/or O2, stress, temperature shifts, and 
different pHs) that modulate untranslated region-dependent expression of a target gene 
utilizing the reporter gene-based assays described herein. In particular, the invention
provides a method of identifying an environmental stimulus, said method comprising (a) 
contacting a cell containing a nucleic acid comprising a reporter gene operably linked to 
one, two, three or more untranslated regions of said target gene with an environmental 
stimulus; and (b) detecting a reporter gene protein translated from said reporter gene, 
wherein an environmental stimulus that modulates untranslated region-dependent expression is identified
if the expression of said reporter gene in the presence of the environmental stimulus is altered
relative to the absence of the environmental stimulus or a previously determined reference range. In a
specific embodiment, the environmental stimulus is not hypoxia. In another embodiment, the
environmental stimulus does not include a compound.

[0028] The reporter gene constructs utilized in the reporter gene-based assays described 
herein may comprise a 5' untranslated region ("UTR") of a target gene, a 3' UTR of a target 
gene, or a 5' UTR and a 3' UTR of a target gene operably linked to a reporter gene. In a
specific embodiment, a reporter gene construct utilized in the reporter gene-based assays 
described herein comprises a 5' UTR of a target gene with a stable hairpin secondary 
structure operably linked to a reporter gene. In a preferred embodiment, a reporter gene 
construct utilized in the reporter gene-based assays described herein comprises a 5' UTR
and a 3' UTR of a target gene. The untranslated regions of a target gene utilized to
construct a reporter gene construct may comprise one or more of the following elements: an
iron response element ("IRE"), internal ribosome entry site ("IRES"), upstream open
reading frame ("uORF"), male specific lethal element ("MSL-2"), G-quartet element,
5'-terminal oligopyrimidine tract ("TOP"), AU-rich element ("ARE"), selenocysteine
insertion sequence ("SECIS"), histone stem loop, cytoplasmic polyadenylation element
("CPE"), nanos translational control element, amyloid precursor protein element ("APP"),
translational regulation element ("TGE")/direct repeat element ("DRE"), Bruno element
("BRE"), and a 15-lipoxygenase differentiation control element ("15-LOX-DICE").
[0029] In addition to untranslated regions, the reporter gene constructs utilized in the
reporter gene-based assays described herein may comprise one, two, three or more introns
within the open reading frame ("ORF") of the reporter gene. Further, the 3' end of a
reporter gene may be polyadenylated and/or the 5' end may be capped. In a specific
embodiment, the 5' end of the reporter gene is not capped. 
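
The layout of such a reporter transcript can be pictured with a small data structure. This is a hypothetical sketch for illustration only: the class name ReporterConstruct, its field names, and the sequences in the usage line are assumptions and do not describe any particular construct of the invention. It assumes the 5' to 3' order of 5' UTR, reporter open reading frame, 3' UTR, and optional poly(A) tail, with the cap recorded as a flag because it is not part of the nucleotide sequence.

```python
from dataclasses import dataclass


@dataclass
class ReporterConstruct:
    """Hypothetical description of a reporter mRNA flanked by target-gene UTRs."""
    reporter_orf: str          # reporter coding sequence (e.g., a luciferase ORF)
    five_utr: str = ""         # 5' UTR of the target gene, if used
    three_utr: str = ""        # 3' UTR of the target gene, if used
    capped: bool = True        # whether the 5' end carries a cap
    poly_a_length: int = 0     # number of adenosines in the poly(A) tail

    def __post_init__(self):
        # The constructs described in the text carry at least one UTR.
        if not (self.five_utr or self.three_utr):
            raise ValueError("at least one untranslated region is required")
        if self.poly_a_length < 0:
            raise ValueError("poly(A) length cannot be negative")

    def transcript(self):
        """Assemble the transcript sequence in 5' to 3' order."""
        return (self.five_utr + self.reporter_orf + self.three_utr
                + "A" * self.poly_a_length)


# Usage with short placeholder sequences (not real UTRs or reporter genes):
construct = ReporterConstruct("AUGUAA", five_utr="GGG", three_utr="CCC",
                              poly_a_length=3)
print(construct.transcript())
```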

[0030] The reporter gene constructs utilized in the reporter gene-based assays described
herein may comprise an untranslated region of a gene whose expression is associated with
or has been linked to the onset, development, progression or severity of a particular disease
or disorder. Alternatively, the reporter gene constructs utilized in the reporter gene-based
assays described herein may comprise an untranslated region of a gene whose expression is
beneficial to a subject with a particular disease or disorder. Examples of genes from which
the untranslated regions may be obtained include, but are not limited to, the gene encoding
tumor necrosis factor alpha ("TNF-α"), the gene encoding granulocyte-macrophage colony
stimulating factor ("GM-CSF"), the gene encoding granulocyte colony stimulating factor
("G-CSF"), the gene encoding interleukin 2 ("IL-2"), the gene encoding interleukin 6 ("IL-
6"), the gene encoding vascular endothelial growth factor ("VEGF"), the genome encoding
hepatitis C virus ("HCV"), the gene encoding survivin, or the gene encoding Her-2. In a 
specific embodiment, an untranslated region is obtained or derived from Her-2 and/or 
VEGF. In another embodiment, an untranslated region is not obtained or derived from the 
gene encoding Her-2. In another embodiment, an untranslated region is not obtained or
derived from the gene encoding VEGF. In another embodiment, an untranslated region is
not obtained or derived from the genes encoding VEGF and Her-2.
[0031] Any reporter gene well-known to one of skill in the art may be utilized in the
reporter gene constructs described herein. Examples of reporter genes include, but are not
limited to, the gene encoding firefly luciferase, the gene encoding Renilla luciferase, the genes
encoding click beetle luciferase, the gene encoding green fluorescent protein, the gene
encoding yellow fluorescent protein, the gene encoding red fluorescent protein, the gene
encoding cyan fluorescent protein, the gene encoding blue fluorescent protein, the gene
encoding beta-galactosidase, the gene encoding beta-glucuronidase, the gene encoding
beta-lactamase, the gene encoding chloramphenicol acetyltransferase, and the gene 
encoding alkaline phosphatase. 

[0032] The reporter gene-based assays described herein may be conducted in a cell 
genetically engineered to express a reporter gene or in vitro utilizing a cell-free translation 
mixture. Any cell or cell line of any species well-known to one of skill in the art may be 
utilized in accordance with the methods of the invention. Further, a cell-free translation 
mixture may be derived from any cell or cell line of any species well-known to one of skill 
in the art. Examples of cells and cell types include, but are not limited to, human cells (e.g., 
HeLa cells and 293 cells), yeast, mouse cells (e.g., cultured mouse cells), rat cells (e.g.,
cultured rat cells), Chinese hamster ovary ("CHO") cells, Xenopus oocytes, cancer cells
(e.g., undifferentiated cancer cells), primary cells, reticulocytes, wheat germ, rye embryo, or
bacterial cells. 

[0033] The compounds utilized in the reporter gene-based assays described herein may 
be members of a library of compounds. In a specific embodiment, the compound is selected
from a combinatorial library of compounds comprising peptoids; random bio-oligomers;
diversomers such as hydantoins, benzodiazepines and dipeptides; vinylogous polypeptides; 
nonpeptidal peptidomimetics; oligocarbamates; peptidyl phosphonates; peptide nucleic acid 
libraries; antibody libraries; carbohydrate libraries; and small organic molecule libraries. In 
a preferred embodiment, the small organic molecule libraries are libraries of 
benzodiazepines, isoprenoids, thiazolidinones, metathiazanones, pyrrolidines, morpholino
compounds, or diazepindiones. 

[0034] Once a compound that modulates untranslated region-dependent expression of a 
target gene is identified, the structure of the compound may be determined utilizing well- 
known techniques or by referring to a predetermined code. For example, the structure of 
the compound may be determined by mass spectroscopy, NMR, vibrational spectroscopy,
or X-ray crystallography. 

[0035] A compound identified in accordance with the methods of the invention may 
directly bind to an RNA transcribed from a target gene. Alternatively, a compound 
identified in accordance with the methods of the invention may bind to one or more trans- 
acting factors (such as, but not limited to, proteins) that modulate untranslated region- 
dependent expression of a target gene. Further, a compound identified in accordance with 
the methods of the invention may disrupt an interaction between the 5' UTR and the 3' UTR. 

[0036] In a specific embodiment, a compound identified in accordance with the 
methods of the invention reduces the translation efficiency and/or stability of an mRNA 
transcribed from a target gene by at least 25%, at least 30%, at least 35%, at least 40%, at 
least 45%, at least 50%, at least 55%, at least 60%, at least 65%, at least 70%, at least 75%, 
at least 80%, at least 85%, at least 90%, at least 95%, or at least 99%, or at least 1.5 fold, at 
least 2 fold, at least 2.5 fold, at least 5 fold, at least 7.5 fold or at least 10 fold relative to a 
control (e.g., PBS), the absence of a control or a previously determined reference range in 
an assay described herein or well-known in the art. In another embodiment, a compound 
identified in accordance with the methods of the invention reduces the translation efficiency 
and/or stability of an mRNA transcribed from a target gene by at least 25%, at least 30%, at 
least 35%, at least 40%, at least 45%, at least 50%, at least 55%, at least 60%, at least 65%, 
at least 70%, at least 75%, at least 80%, at least 85%, at least 90%, at least 95%, or at least 
99%, or at least 1.5 fold, at least 2 fold, at least 2.5 fold, at least 5 fold, at least 7.5 fold or at 
least 10 fold relative to a control (e.g., PBS), the absence of a control or a previously 
determined reference range in an assay described herein or well-known in the art. 
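By way of non-limiting illustration only, the percentage and fold thresholds recited above can be expressed as a short calculation. The following Python sketch is not part of the disclosed assays: the function names and the example signal values are hypothetical, and it assumes that the reporter signal measured in the presence of a compound is compared with the signal from a control (e.g., PBS).

```python
def percent_reduction(control: float, treated: float) -> float:
    """Percent decrease of the treated signal relative to the control signal."""
    if control <= 0:
        raise ValueError("control signal must be positive")
    return (control - treated) * 100.0 / control


def fold_reduction(control: float, treated: float) -> float:
    """Ratio of control to treated signal; 2.0 means the signal was halved."""
    if control <= 0 or treated <= 0:
        raise ValueError("signals must be positive")
    return control / treated


def meets_percent_threshold(control: float, treated: float,
                            min_percent: float = 25.0) -> bool:
    """True if the reduction is at least `min_percent` (e.g., 25, 50, 90)."""
    return percent_reduction(control, treated) >= min_percent


if __name__ == "__main__":
    # Hypothetical reporter readings (arbitrary units), not measured data.
    control_signal, treated_signal = 1000.0, 400.0
    print(percent_reduction(control_signal, treated_signal))
    print(fold_reduction(control_signal, treated_signal))
    print(meets_percent_threshold(control_signal, treated_signal, 50.0))
```

The same functions can be applied to any of the recited thresholds by changing `min_percent` or by comparing the fold value with 1.5, 2, 2.5, 5, 7.5 or 10.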
[0037] A compound that modulates untranslated region-dependent expression in a 
reporter gene-based assay described herein may be subsequently tested in in vitro assays 
(e.g., cell-free assays) or in vivo assays (e.g., cell-based assays) well-known to one of skill 
in the art or described herein for the effect of said compound on the expression of the target 
gene from which the untranslated regions of the reporter gene construct were derived. 
Further, to assess the specificity of a particular compound's effect on untranslated region- 
dependent expression of a target gene, the effect of said compound on the expression of one 
or more genes (preferably, a plurality of genes) can be determined utilizing assays well- 
known to one of skill in the art or described herein. In a preferred embodiment, a 
compound identified utilizing the reporter gene-based assays described herein has a specific 
effect on the expression of only one gene or a group of genes within the same signaling 
pathway. 

[0038] In a specific embodiment, the specificity of a particular compound for an 
untranslated region of a target gene is determined by (a) contacting the compound of 
interest with a cell containing a nucleic acid comprising a reporter gene operably linked to 
a UTR of a different gene; and (b) detecting a reporter gene protein translated from the 
reporter gene, wherein the compound is specific for the untranslated region of the target 
gene if the expression of said reporter gene in the presence of the compound is not altered 
or is not substantially altered relative to a previously determined reference range, or the 
expression in the absence of the compound or the presence of a control (e.g., PBS). In 
another embodiment, the specificity of a particular compound for an untranslated region of 
a target gene is determined by (a) contacting the compound of interest with a panel of cells, 
each cell in a different well of a container (e.g., a 48 or 96 well microtiter plate) and each 
cell containing a nucleic acid comprising a reporter gene operably linked to a UTR of a 
different gene; and (b) detecting a reporter gene protein translated from the reporter gene, 
wherein the compound is specific for the untranslated region of the target gene if the 
expression of said reporter gene in the presence of the compound is not altered or is not 
substantially altered relative to a previously determined reference range, or the expression 
in the absence of the compound or the presence of a control (e.g., PBS). In accordance with 
this embodiment, the panel may comprise 5, 7, 10, 15, 20, 25, 50, 75, 100 or more cells. In 
another embodiment, the specificity of a particular compound for an untranslated region of 
a target gene is determined by (a) contacting the compound of interest with a cell-free 
translation mixture and a nucleic acid comprising a reporter gene operably linked to a 
UTR of a different gene; and (b) detecting a reporter gene protein translated from the 
reporter gene, wherein the compound is specific for the untranslated region of the target 
gene if the expression of said reporter gene in the presence of the compound is not altered 
or is not substantially altered relative to a previously determined reference range, or the 
expression in the absence of the compound or the presence of a control (e.g., PBS). As 
used herein, the term "not substantially altered" means that the compound alters the 
expression of the reporter gene or target gene by less than 20%, less than 15%, less than 
10%, less than 5%, or less than 2% relative to a negative control such as PBS. 
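By way of non-limiting illustration only, the "not substantially altered" criterion can be applied across a panel of reporter constructs bearing UTRs of different genes. The Python sketch below is an assumption-laden example rather than the disclosed method: the construct names, the signal values and the dictionary layout are hypothetical, and the 20% default simply mirrors the loosest threshold recited above.

```python
def percent_change(control: float, treated: float) -> float:
    """Absolute percent change of the treated signal relative to the control."""
    if control <= 0:
        raise ValueError("control signal must be positive")
    return abs(treated - control) * 100.0 / control


def is_specific(off_target_panel: dict, max_change_percent: float = 20.0) -> bool:
    """True if every off-target reporter is 'not substantially altered'.

    `off_target_panel` maps a construct name to a
    (negative_control_signal, compound_signal) pair.
    """
    return all(
        percent_change(control, treated) < max_change_percent
        for control, treated in off_target_panel.values()
    )


if __name__ == "__main__":
    # Hypothetical panel: reporters linked to UTRs of genes other than the target.
    panel = {
        "UTR_gene_A": (1000.0, 950.0),  # 5% change
        "UTR_gene_B": (800.0, 760.0),   # 5% change
    }
    print(is_specific(panel))
    print(is_specific(panel, max_change_percent=2.0))
```

Tightening `max_change_percent` to 15, 10, 5 or 2 corresponds to the narrower thresholds recited in the definition.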
[0039] The invention provides for methods for treating, preventing or ameliorating one 
or more symptoms of a disease or disorder associated with the aberrant expression of a 
target gene, said method comprising administering to a subject in need thereof a 
therapeutically or prophylactically effective amount of a compound, or a pharmaceutically 
acceptable salt thereof, identified according to the methods described herein. In one 
embodiment, the target gene is aberrantly overexpressed. In another embodiment, the target 
gene is expressed at an aberrantly low level. In particular, the invention provides for a 
method of treating or preventing a disease or disorder or ameliorating a symptom thereof, 
said method comprising administering to a subject in need thereof an effective amount of a 
compound, or a pharmaceutically acceptable salt thereof, identified according to the 
methods described herein, wherein said effective amount increases the expression of a 
target gene beneficial in the treatment or prevention of said disease or disorder. The 
invention also provides for a method of treating or preventing a disease or disorder or 
ameliorating a symptom thereof, said method comprising administering to a subject in need 
thereof an effective amount of a compound, or a pharmaceutically acceptable salt thereof, 
identified according to the methods described herein, wherein said effective amount 
decreases the expression of a target gene whose expression is associated with or has been 
linked to the onset, development, progression or severity of said disease or disorder. In a 
specific embodiment, the disease or disorder is a proliferative disorder, an inflammatory 
disorder, an infectious disease, a genetic disorder, an autoimmune disorder, a cardiovascular 
disease, or a central nervous system disorder. In an embodiment wherein the disease or 
disorder is an infectious disease, the infectious disease can be caused by a fungal infection, 
a bacterial infection, a viral infection, or an infection caused by another type of pathogen. 
[0040] The invention provides a method for identifying a compound that inhibits or 
reduces angiogenesis, said method comprising: (a) contacting a member of a library of 
compounds with a cell containing a nucleic acid comprising a reporter gene operably linked 
to one, two, three or more untranslated regions of a target gene; and (b) detecting the 
expression of said reporter gene, wherein if a compound that reduces the expression of said 
reporter gene relative to a previously determined reference range, or to the expression of 
said reporter gene in the absence of said compound or in the presence of a control (e.g., 
PBS) is detected in (b), then (c) contacting the compound with a tumor cell and detecting 
the proliferation of said tumor cell, so that if the compound reduces or inhibits the 
proliferation of the tumor cell, the compound is identified as a compound that inhibits or 
reduces angiogenesis. The invention provides a method for identifying a compound that 
inhibits or reduces angiogenesis, said method comprising: (a) contacting a cell-free 
translation mixture with a member of a library of compounds and a nucleic acid comprising 
a reporter gene operably linked to one, two, three or more untranslated regions of a target 
gene; and (b) detecting the expression of said reporter gene, wherein if a compound that 
reduces the expression of said reporter gene relative to a previously determined reference 
range, or to the expression of said reporter gene in the absence of said compound or in the 
presence of a control (e.g., PBS) is detected in (b), then (c) contacting the compound with a 
tumor cell and detecting the proliferation of said tumor cell, so that if the compound reduces 
or inhibits the proliferation of the tumor cell, the compound is identified as a compound that 
inhibits or reduces angiogenesis. In a specific embodiment, the compound is further tested 
in an animal model for angiogenesis by, e.g., administering said compound to said animal 
model and verifying that angiogenesis is inhibited by said compound in said animal model. 
In a preferred embodiment, the target gene is VEGF. In another embodiment, the 
compound identified in accordance with the methods of the invention inhibits or reduces 
angiogenesis by at least 25%, at least 30%, at least 35%, at least 40%, at least 45%, at least 
50%, at least 55%, at least 60%, at least 65%, at least 70%, at least 75%, at least 80%, at 
least 85%, at least 90%, at least 95% or at least 99%, or at least 1.5 fold, at least 2 fold, at 
least 2.5 fold, at least 5 fold, at least 7.5 fold, or at least 10 fold relative to a control (e.g., 
PBS) in an assay described herein or well-known in the art. 

[0041] The invention provides for a method for identifying a therapeutic agent for the 
treatment or prevention of cancer, or amelioration of a symptom thereof, said method 
comprising: (a) contacting a member of a library of compounds with a cell containing a 
nucleic acid comprising a reporter gene operably linked to one, two, three or more 
untranslated regions of a target gene; and (b) detecting the expression of said reporter gene, 
wherein if a compound that reduces the expression of said reporter gene relative to a 
previously determined reference range, or the expression of said reporter gene in the 
absence of said compound or the presence of a control (e.g., PBS) is detected in (b), then (c) 
contacting the compound with a cancer cell and detecting the proliferation of said cancer 
cell, so that if the compound reduces or inhibits the proliferation of the cancer cell, the 
compound is identified as a therapeutic agent for the treatment or prevention of cancer, or 
amelioration of a symptom thereof. The invention also provides for a method for 
identifying a therapeutic agent for the treatment or prevention of cancer, or amelioration of 
a symptom thereof, said method comprising: (a) contacting a cell-free translation mixture 
with a member of a library of compounds and a nucleic acid comprising a reporter gene 
operably linked to one, two, three or more untranslated regions of a target gene; and (b) 
detecting the expression of said reporter gene, wherein if a compound that reduces the 
expression of said reporter gene relative to a previously determined reference range, or the 
expression of said reporter gene in the absence of said compound or the presence of a 
control (e.g., PBS) is detected in (b), then (c) contacting the compound with a cancer cell 
and detecting the proliferation of said cancer cell, so that if the compound reduces or 
inhibits the proliferation of the cancer cell, the compound is identified as a therapeutic agent 
for the treatment or prevention of cancer, or amelioration of a symptom thereof. In a 
specific embodiment, the compound is further tested in an animal model for cancer by, e.g., 
administering said compound to said animal model and verifying that the compound is 
effective in reducing the proliferation or spread of cancer cells in said animal model. In a 
preferred embodiment, the target gene is survivin. 

[0042] In a specific embodiment, the invention provides for a method of identifying a 
therapeutic agent for the treatment or prevention of breast cancer, or amelioration of a 
symptom thereof, said method comprising: (a) contacting a member of a library of 
compounds with a cell containing a nucleic acid comprising a reporter gene operably linked 
to one, two, three or more untranslated regions of a target gene; and (b) detecting the 
expression of said reporter gene, wherein if a compound that reduces the expression of said 
reporter gene relative to a previously determined reference range, or the expression of said 
reporter gene in the absence of said compound or the presence of a control is detected in 
(b), then (c) contacting the compound with a breast cancer cell and detecting the 
proliferation of said breast cancer cell, so that if the compound reduces or inhibits the 
proliferation of the breast cancer cell, the compound is identified as a therapeutic agent for 
the treatment or prevention of breast cancer, or amelioration of a symptom thereof. In 
another embodiment, the invention provides for a method of identifying a therapeutic agent 
for the treatment or prevention of breast cancer, or amelioration of a symptom thereof, said 
method comprising: (a) contacting a cell-free translation mixture with a member of a library 
of compounds and a nucleic acid comprising a reporter gene operably linked to one, two, 
three or more untranslated regions of a target gene; and (b) detecting the expression of said 
reporter gene, wherein if a compound that reduces the expression of said reporter gene 
relative to a previously determined reference range, or the expression of said reporter gene 
in the absence of said compound or the presence of a control is detected in (b), then (c) 
contacting the compound with a breast cancer cell and detecting the proliferation of said 
breast cancer cell, so that if the compound reduces or inhibits the proliferation of the breast 
cancer cell, the compound is identified as a therapeutic agent for the treatment or prevention 
of breast cancer, or amelioration of a symptom thereof. In accordance with these 
embodiments, the compound may be further tested in an animal model for breast cancer by, 
e.g., administering said compound to said animal model and verifying that the compound is 
effective in reducing the proliferation or spread of breast cancer cells in said animal model. 
In a preferred embodiment, the target gene is Her-2. 

[0043] The invention also provides methods for upregulating or downregulating the 
expression of a target gene utilizing a compound identified in accordance with the methods 
described herein. The upregulation or downregulation of a target gene is particularly useful 
in vitro when attempting to produce a protein encoded by said target gene for use as a 
therapeutic or prophylactic agent, or in experiments conducted to, e.g., identify the function 
or efficacy of said protein. In particular, the invention provides a method of modulating the 
expression of a target gene, said method comprising contacting a cell with an effective 
amount of a compound or pharmaceutically acceptable derivative thereof, identified 
according to the methods described herein. In one embodiment, the cell is a eucaryotic cell. 
In another embodiment, the cell is a procaryotic cell. 

[0044] The invention further provides methods for verifying or confirming the ability of 
a compound to modulate untranslated region-dependent expression of a target gene. The 
ability of a compound to modulate untranslated region-dependent expression of a target 
gene can be verified or confirmed utilizing any of the assays described herein to identify 
such a compound. In a first embodiment, the invention provides a method for verifying the 
ability of a compound to modulate untranslated region-dependent expression of a target 
gene, said method comprising: (a) expressing a nucleic acid comprising a reporter gene 
operably linked to one, two, three or more untranslated regions of said target gene in a cell; 
(b) contacting said cell with a compound; and (c) detecting the expression of said reporter 
gene, wherein a compound that modulates untranslated region-dependent expression is 
verified if the expression of said reporter gene in the presence of a compound is altered as 
compared to a previously determined reference range or the expression of said reporter gene 
in the absence of said compound or the presence of a control. 

[0045] In a second embodiment, the invention provides a method for verifying the 
ability of a compound to modulate untranslated region-dependent expression of a target 
gene, said method comprising: (a) contacting a compound with a cell-free translation 
mixture and a nucleic acid comprising a reporter gene operably linked to one, two, three or 
more untranslated regions of said target gene; and (b) detecting the expression of said 
reporter gene, wherein a compound that modulates untranslated region-dependent 
expression is verified if the expression of said reporter gene in the presence of a compound 
is altered as compared to a previously determined reference range or the expression of said 
reporter gene in the absence of said compound or the presence of a control. 
[0046] In a third embodiment, the invention provides a method for verifying the ability 
of a compound to modulate untranslated region-dependent expression of a target gene, said 
method comprising: (a) contacting a compound with a cell containing a nucleic acid 
comprising a reporter gene operably linked to one, two, three or more untranslated regions 
of said target gene; and (b) detecting the expression of said reporter gene, wherein a 
compound that modulates untranslated region-dependent expression is verified if the 
expression of said reporter gene in the presence of a compound is altered as compared to a 
previously determined reference range or the expression of said reporter gene in the absence 
of said compound or the presence of a control. 
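By way of non-limiting illustration only, the verification step common to the three embodiments above reduces to asking whether the reporter signal obtained in the presence of the compound falls outside a previously determined reference range. The Python sketch below is a hypothetical example: the `ReferenceRange` type, the `classify_expression` function and the numeric bounds are assumptions introduced for illustration and are not specified herein.

```python
from typing import NamedTuple


class ReferenceRange(NamedTuple):
    """Previously determined lower and upper bounds of unaltered expression."""
    low: float
    high: float


def classify_expression(signal: float, ref: ReferenceRange) -> str:
    """Classify a reporter signal as 'reduced', 'increased' or 'unaltered'."""
    if ref.low > ref.high:
        raise ValueError("reference range bounds are inverted")
    if signal < ref.low:
        return "reduced"
    if signal > ref.high:
        return "increased"
    return "unaltered"


def modulation_verified(signal: float, ref: ReferenceRange) -> bool:
    """A compound is verified as a modulator if expression is altered."""
    return classify_expression(signal, ref) != "unaltered"


if __name__ == "__main__":
    # Hypothetical range established for one assay and one cell type.
    ref = ReferenceRange(low=800.0, high=1200.0)
    for reading in (450.0, 1000.0, 1900.0):
        print(reading, classify_expression(reading, ref))
```

The same check applies whether the signal comes from a cell-based assay or from a cell-free translation mixture, provided the reference range was established for that same assay format.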

3.1. Terminology 

[0047] As used herein, the term "5' cap" refers to a methylated guanine cap, e.g., a 
7-methylguanosine (5'-5') RNA triphosphate, that is added to the 5' end of a pre-mRNA. 

[0048] As used herein, the term "ARE" refers to an adenylate uridylate rich element in 
the 3' UTR of a mRNA. 

[0049] As used herein, the term "compound" refers to any agent or complex that is 
being tested for its ability to modulate untranslated region-dependent expression of a target 
gene, or any agent or complex identified by the methods described herein. Examples of 
compounds include, but are not limited to, proteins, polypeptides, peptides, peptide analogs 
(including peptides comprising non-naturally occurring amino acids, e.g., D-amino acids, 
phosphorous analogs of amino acids, such as α-amino phosphoric acids and α-amino 
phosphoric acids, or amino acids having non-peptide linkages), nucleic acids, nucleic acid 
analogs such as phosphorothioates and PNAs, hormones, antigens, antibodies, lipids, fatty 
acids, synthetic or naturally occurring drugs, opiates, dopamine, serotonin, catecholamines, 
thrombin, acetylcholine, prostaglandins, organic molecules, pheromones, adenosine, 
sucrose, glucose, lactose and galactose. 

[0050] As used herein, the term "CUG repeat" refers to a repeat of a cytosine-uracil- 
guanine triplet in the 3' UTR of a mRNA. 

[0051] As used herein, the term "cytosine rich element" refers to cytosine-rich stability 
determinant sequences in the 3' UTR of a mRNA. 

[0052] As used herein, the terms "disorder" and "disease" refer to a condition in a 
subject. 

[0053] As used herein, the term "effective amount" refers to the amount of a compound 
which is sufficient to reduce or ameliorate the severity and/or duration of a disease or disorder 
or a symptom thereof, prevent the advancement of a disease or disorder, cause regression of 
a disease or disorder, prevent the recurrence, development, or onset of one or more 
symptoms associated with a disease or disorder, or enhance or improve the prophylactic or 
therapeutic effect(s) of another therapy (e.g., prophylactic or therapeutic agent). 
[0054] As used herein, the term "fragment" refers to a nucleotide sequence comprising 
a nucleic acid sequence of at least 5 contiguous nucleic acid residues, at least 10 
contiguous nucleic acid residues, at least 15 contiguous nucleic acid residues, at least 20 
contiguous nucleic acid residues, at least 25 contiguous nucleic acid residues, at least 40 
contiguous nucleic acid residues, at least 50 contiguous nucleic acid residues, at least 60 
contiguous nucleic acid residues, at least 70 contiguous nucleic acid residues, at least 80 
contiguous nucleic acid residues, at least 90 contiguous nucleic acid residues, at least 100 
contiguous nucleic acid residues, at least 125 contiguous nucleic acid residues, at least 150 
contiguous nucleic acid residues, at least 175 contiguous nucleic acid residues, at least 200 
contiguous nucleic acid residues, or at least 250 contiguous nucleic acid residues of the 
nucleotide sequence of an untranslated region of a target gene. In a specific embodiment, a 
fragment of an untranslated region of a target gene retains at least one element of the 
untranslated region (e.g., an IRES). 
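By way of non-limiting illustration only, the length requirement in the definition of "fragment" can be checked computationally. The Python sketch below is hypothetical: the function name and the example sequences are invented for illustration, and it assumes that sequences are supplied as plain strings and that DNA-style "T" may be treated as RNA "U" for the comparison.

```python
def is_utr_fragment(candidate: str, utr: str, min_length: int = 5) -> bool:
    """True if `candidate` is a contiguous stretch of `utr` of at least
    `min_length` residues (e.g., 5, 10, 15, ... 250)."""
    # Normalize case and treat T as U so DNA and RNA notation compare equal.
    cand = candidate.upper().replace("T", "U")
    full = utr.upper().replace("T", "U")
    return len(cand) >= min_length and cand in full


if __name__ == "__main__":
    # Made-up sequence used purely as an example; not a real UTR.
    example_utr = "GGCAUCGAUUACG"
    print(is_utr_fragment("AUCGAUU", example_utr))      # long enough, contiguous
    print(is_utr_fragment("AUCG", example_utr))         # shorter than 5 residues
    print(is_utr_fragment("AUCGGG", example_utr))       # not present in the UTR
```

Whether a fragment also retains a functional element such as an IRES cannot be decided by this string check and would need to be confirmed experimentally.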

[0055] As used herein, the term "target RNA" refers to an RNA of interest, i.e., the 
RNA transcribed from a target gene or a gene of interest. In a preferred embodiment, the 
target RNA contains one or more untranslated regions, and more preferably, contains at 
least one element of the untranslated region (e.g., an IRES). 

[0056] As used herein, the term "host cell" includes a particular subject cell 
transformed or transfected with a nucleic acid molecule and the progeny or potential 
progeny of such a cell. Progeny of such a cell may not be identical to the parent cell 
transfected with the nucleic acid molecule due to mutations or environmental influences 
that may occur in succeeding generations or integration of the nucleic acid molecule into 
the host cell genome. 

[0057] As used herein, the term "in combination" refers to the use of more than one 
therapy (e.g., prophylactic and/or therapeutic agents). The use of the term "in 
combination" does not restrict the order in which therapies (e.g., prophylactic and/or 
therapeutic agents) are administered to a subject with a particular disease or disorder. A 
first therapy (e.g., a prophylactic or therapeutic agent such as, e.g., a compound identified in 
accordance with the methods of the invention) can be administered prior to (e.g., 5 minutes, 
15 minutes, 30 minutes, 45 minutes, 1 hour, 2 hours, 4 hours, 6 hours, 12 hours, 24 hours, 
48 hours, 72 hours, 96 hours, 1 week, 2 weeks, 3 weeks, 4 weeks, 5 weeks, 6 weeks, 8 
weeks, or 12 weeks before), concomitantly with, or subsequent to (e.g., 5 minutes, 15 
minutes, 30 minutes, 45 minutes, 1 hour, 2 hours, 4 hours, 6 hours, 12 hours, 24 hours, 48 
hours, 72 hours, 96 hours, 1 week, 2 weeks, 3 weeks, 4 weeks, 5 weeks, 6 weeks, 8 weeks, 
or 12 weeks after) the administration of a second therapy (e.g., a prophylactic or therapeutic 
agent such as, e.g., a chemotherapeutic agent, an anti-inflammatory agent or a TNF-α 
antagonist) to a subject with a particular disease or disorder. 

[0058] As used herein, the term "IRE" refers to an iron response element in the 5' UTR 
or 3' UTR of a mRNA. 

[0059] As used herein, the term "IRES" refers to an internal ribosome entry site in the 
5' UTR of a mRNA. 

[0060] As used herein, the term "library" refers to a plurality of compounds. A library 
can be a combinatorial library, e.g., a collection of compounds synthesized using 
combinatorial chemistry techniques, or a collection of unique chemicals of low molecular 
weight (less than 1000 daltons) that each occupy a unique three-dimensional space. 

[0061] As used herein, the term "ORF" refers to the open reading frame of a mRNA, 
i.e., the region of the mRNA that is translated into protein. 

[0062] As used herein, the terms "non-responsive" and "refractory" describe patients 
treated with a currently available therapy (e.g., a prophylactic or therapeutic agent) for a 
disease or disorder, which is not clinically adequate to relieve one or more symptoms 
associated with such disease or disorder. Typically, such patients suffer from severe, 
persistently active disease and require additional therapy to ameliorate the symptoms 
associated with their disease or disorder. 

[0063] As used herein, the phrase "pharmaceutically acceptable salt(s)," includes, but is 
not limited to, salts of acidic or basic groups that may be present in compounds identified 
using the methods of the present invention. Compounds that are basic in nature are capable 
of forming a wide variety of salts with various inorganic and organic acids. The acids that 
can be used to prepare pharmaceutically acceptable acid addition salts of such basic 
compounds are those that form non-toxic acid addition salts, i.e., salts containing 
pharmacologically acceptable anions, including but not limited to sulfuric, citric, maleic, 
acetic, oxalic, hydrochloride, hydrobromide, hydroiodide, nitrate, sulfate, bisulfate, 
phosphate, acid phosphate, isonicotinate, acetate, lactate, salicylate, citrate, acid citrate, 
tartrate, oleate, tannate, pantothenate, bitartrate, ascorbate, succinate, maleate, gentisinate, 
fumarate, gluconate, glucaronate, saccharate, formate, benzoate, glutamate, 
methanesulfonate, ethanesulfonate, benzenesulfonate, p-toluenesulfonate and pamoate (i.e., 
1,1'-methylene-bis-(2-hydroxy-3-naphthoate)) salts. Compounds that include an amino 
moiety may form pharmaceutically acceptable salts with various amino acids, in addition to 
the acids mentioned above. Compounds that are acidic in nature are capable of forming 
base salts with various pharmacologically acceptable cations. Examples of such salts 
include alkali metal or alkaline earth metal salts and, particularly, calcium, magnesium, 
sodium, lithium, zinc, potassium, and iron salts. 

[0064] As used herein, the term "poly(A) tail" refers to a polyadenylic acid tail that is 
added to the 3' end of a pre-mRNA. 

[0065] As used herein, the terms "prophylactic agent" and "prophylactic agents" refer 
to any agent(s) which can be used in the prevention of a particular disease or disorder. In 
certain embodiments, the term "prophylactic agent" refers to a compound identified in the 
screening assays described herein. In certain other embodiments, the term "prophylactic 
agent" does not refer to a compound identified in the screening assays described herein. 

[0066] As used herein, the phrase "prophylactically effective amount" refers to the 
amount of a therapy (e.g., a prophylactic agent) which is sufficient to result in the 
prevention of the development, recurrence or onset of a disease or disorder or one or more 
symptoms associated therewith. 

[0067] As used herein, the terms "prevent", "preventing" and "prevention" refer to the 
prevention of the development, recurrence or onset of a disease or disorder or one or more 
symptoms thereof resulting from the administration of one or more compounds identified in 
accordance with the methods of the invention or the administration of a combination of such a 
compound and a known therapy for a particular disease or disorder. 

[0068] As used herein, the term "previously determined reference range" refers to a 
reference range for the expression and/or the activity of a reporter gene or a target gene by a 
particular cell or in a particular cell-free translation mixture. Each laboratory will establish 
its own reference range for each particular assay, each cell type and each cell-free 
translation mixture. In a preferred embodiment, at least one positive control and at least 
one negative control are included in each batch of compounds analyzed. 
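By way of non-limiting illustration only, one way a laboratory might establish its own reference range is from repeated measurements of untreated reporter signal. The Python sketch below rests on assumptions not stated herein: it takes the range to be the mean plus or minus `k` sample standard deviations (with `k` = 2 chosen arbitrarily), and the well records, role labels and signal values are hypothetical. It also checks that a batch contains at least one positive and at least one negative control.

```python
import statistics


def reference_range(untreated_signals, k: float = 2.0):
    """Return (low, high) as mean -/+ k sample standard deviations.

    Assumption for illustration only; other statistical definitions
    of a reference range could equally be used.
    """
    if len(untreated_signals) < 2:
        raise ValueError("need at least two measurements")
    mean = statistics.mean(untreated_signals)
    sd = statistics.stdev(untreated_signals)
    return (mean - k * sd, mean + k * sd)


def batch_has_required_controls(batch) -> bool:
    """True if the batch has >= 1 positive and >= 1 negative control well."""
    roles = {well["role"] for well in batch}
    return "positive_control" in roles and "negative_control" in roles


if __name__ == "__main__":
    # Hypothetical historical readings for one assay and one cell type.
    low, high = reference_range([90.0, 100.0, 110.0])
    print(low, high)

    batch = [
        {"well": "A1", "role": "negative_control"},  # e.g., PBS
        {"well": "A2", "role": "positive_control"},
        {"well": "A3", "role": "compound"},
    ]
    print(batch_has_required_controls(batch))
```

A separate range would be computed for each assay, each cell type and each cell-free translation mixture, consistent with the definition above.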
[0069] As used herein, the term "small molecules" and analogous terms include, but are 
not limited to, peptides, peptidomimetics, amino acids, amino acid analogs, 
polynucleotides, polynucleotide analogs, nucleotides, nucleotide analogs, organic or 
inorganic compounds (i.e., including heteroorganic and organometallic compounds) having 
a molecular weight less than about 10,000 grams per mole, organic or inorganic compounds 
having a molecular weight less than about 5,000 grams per mole, organic or inorganic 
compounds having a molecular weight less than about 1,000 grams per mole, organic or 
inorganic compounds having a molecular weight less than about 500 grams per mole, 
organic or inorganic compounds having a molecular weight less than about 100 grams per 
mole, and salts, esters, and other pharmaceutically acceptable forms of such compounds. 
Salts, esters, and other pharmaceutically acceptable forms of such compounds are also 
encompassed. 

[0070] As used herein, the terms "subject" and "patient" are used interchangeably. 
The terms "subject" and "subjects" refer to an animal, preferably a mammal 
including a non-primate (e.g., a cow, pig, horse, cat, dog, rat, and mouse) and a primate 
(e.g., a monkey such as a cynomolgous monkey and a human), and more preferably a 
human. In one embodiment, the subject is refractory or non-responsive to current therapies 
for a disease or disorder (e.g., viral infections, fungal infections, bacterial infections, 
proliferative diseases or inflammatory diseases). In another embodiment, the subject is a 
farm animal (e.g., a horse, a cow, a pig, etc.) or a household pet (e.g., a dog or a cat). In a 
preferred embodiment, the subject is a human. 

[0071] As used herein, the term "synergistic" refers to a combination of a compound 
identified using one of the methods described herein, and another therapy (preferably, a 
therapy which has been or is currently being used to prevent or treat a particular disease or 
disorder) which is more effective than the additive effects of the therapies. A synergistic 
effect of a combination of therapies (e.g., prophylactic or therapeutic agents) permits the 
use of lower dosages of one or more of the therapies and/or less frequent administration of 
said therapies to a subject with a particular disease or disorder. The ability to utilize lower 
dosages of a therapy (e.g., prophylactic or therapeutic agent) and/or to administer said 
therapy less frequently reduces the toxicity associated with the administration of said 
therapy to a subject without reducing the efficacy of said therapies in the prevention or 
treatment of a particular disease or disorder. In addition, a synergistic effect can result in 
improved efficacy of therapies in the prevention or treatment of a particular disease or 
disorder. Finally, a synergistic effect of a combination of therapies (e.g., prophylactic or 
therapeutic agents) may avoid or reduce adverse or unwanted side effects associated with 
the use of either therapy alone. 

[0072] As used herein, the term "target gene" refers to a gene or nucleotide sequence 
encoding a protein or polypeptide of interest. In a preferred embodiment, the gene or 
nucleotide sequence comprises an untranslated region. 

[0073] As used herein, a "target nucleic acid" refers to RNA, DNA, or a chemically 
modified variant thereof. In a preferred embodiment, the target nucleic acid is RNA. In a 
preferred embodiment, the target nucleic acid refers to the untranslated region of an mRNA, 
such as, but not limited to, a 5' UTR and a 3' UTR. In another embodiment, the target 
nucleic acid refers to an open reading frame of an mRNA. A target nucleic acid also refers 
to tertiary structures of the nucleic acids, such as, but not limited to, loops, bulges, 
pseudoknots, guanosine quartets and turns. A target nucleic acid also refers to RNA 
elements such as, but not limited to, the HIV TAR element, internal ribosome entry site, 
instability elements, and adenylate uridylate-rich elements, which are described in Section 
5.1. Non-limiting examples of target nucleic acids are presented in Section 5.1 and Section 
6. 

[0074] As used herein, the terms "therapeutic agent" and "therapeutic agents" refer to 
any agent(s) which can be used in the prevention, treatment, management or amelioration of 
one or more symptoms of a particular disease or disorder. In certain embodiments, the term 
"therapeutic agent" refers to a compound identified in the screening assays described 
herein. In other embodiments, the term "therapeutic agent" does not refer to a compound 
identified in the screening assays described herein. 

[0075] As used herein, the term "therapeutically effective amount" refers to that amount 
of a therapy (e.g., a therapeutic agent) sufficient to reduce the severity of a disease or 
disorder, reduce the duration of a disease or disorder, ameliorate one or more symptoms 
of a disease or disorder, prevent advancement of a disease or disorder, cause regression of 
the disease or disorder, or to enhance or improve the therapeutic effect(s) of another 
therapeutic agent. In a specific embodiment, with respect to the treatment of cancer, a 
therapeutically effective amount refers to the amount of a therapy (e.g., a therapeutic agent) 
that inhibits or reduces the proliferation of cancerous cells, inhibits or reduces the spread of 
tumor cells (metastasis), inhibits or reduces the onset, development or progression of one or 
more symptoms associated with cancer, or reduces the size of a tumor. Preferably, with 
respect to the treatment of cancer, a therapeutically effective amount of a therapy (e.g., a therapeutic 
agent) reduces the proliferation of cancerous cells or the size of a tumor by at least 5%, 
preferably at least 10%, at least 15%, at least 20%, at least 25%, at least 30%, at least 35%, 
at least 40%, at least 45%, at least 50%, at least 55%, at least 60%, at least 65%, at least 
70%, at least 75%, at least 80%, at least 85%, at least 90%, at least 95%, or at least 99% 
relative to a control (e.g., PBS) in an assay described herein or well-known in the art. 
[0076] In another embodiment, with respect to the treatment of a viral infection, a 
therapeutically effective amount refers to the amount of a therapy (e.g., a therapeutic 
agent) sufficient to reduce or inhibit the replication of a virus, inhibit or reduce the spread of 
the virus to other tissues or subjects, or ameliorate one or more symptoms associated with 
the viral infection. Preferably, with respect to a viral infection, a therapeutically effective 
amount of a therapy (e.g., a therapeutic agent) reduces the replication or spread of a virus 
by at least 5%, preferably at least 10%, at least 15%, at least 20%, at least 25%, at least 
30%, at least 35%, at least 40%, at least 45%, at least 50%, at least 55%, at least 60%, at 
least 65%, at least 70%, at least 75%, at least 80%, at least 85%, at least 90%, at least 95%, 
or at least 99% relative to a control (e.g., PBS) in an assay described herein or well-known 
in the art. 

[0077] In another embodiment, with respect to the treatment of a fungal infection, a 
therapeutically effective amount refers to the amount of a therapy (e.g., a therapeutic agent) 
sufficient to inhibit or reduce the replication of the fungus, inhibit or reduce the replication 
or spread of the fungus to other tissues or subjects, or ameliorate one or more symptoms 
associated with the fungal infection. Preferably, with respect to a fungal infection, a 
therapeutically effective amount of a therapy (e.g., a therapeutic agent) reduces the spread 
of a fungus by at least 5%, preferably at least 10%, at least 15%, at least 20%, at least 25%, 
at least 30%, at least 35%, at least 40%, at least 45%, at least 50%, at least 55%, at least 
60%, at least 65%, at least 70%, at least 75%, at least 80%, at least 85%, at least 90%, at 
least 95%, or at least 99% relative to a control (e.g., PBS) in an assay described herein or 
well-known in the art. 

[0078] In another embodiment, with respect to the treatment of a bacterial infection, a 
therapeutically effective amount refers to the amount of a therapy (e.g., a therapeutic agent) 
sufficient to inhibit or reduce the replication of the bacteria, to inhibit or reduce the 
replication or spread of the bacteria to other tissues or subjects, or ameliorate one or more 
symptoms associated with the bacterial infection. Preferably, with respect to a bacterial 
infection, a therapeutically effective amount of a therapy (e.g., a therapeutic agent) reduces 
the spread of a bacteria by at least 5%, preferably at least 10%, at least 15%, at least 20%, at 
least 25%, at least 30%, at least 35%, at least 40%, at least 45%, at least 50%, at least 55%, 
at least 60%, at least 65%, at least 70%, at least 75%, at least 80%, at least 85%, at least 
90%, at least 95%, or at least 99% relative to a control (e.g., PBS) in an assay described 
herein or well-known in the art. 

[0079] In another embodiment, with respect to the treatment of an inflammatory 
disorder, a therapeutically effective amount refers to the amount of a therapy (e.g., a 
therapeutic agent) that reduces the inflammation of a joint, organ or tissue. Preferably, with 
respect to an inflammatory disorder, a therapeutically effective amount of a therapy (e.g., a 
therapeutic agent) reduces the inflammation of a joint, organ or tissue by at least 5%, 
preferably at least 10%, at least 15%, at least 20%, at least 25%, at least 30%, at least 35%, 
at least 40%, at least 45%, at least 50%, at least 55%, at least 60%, at least 65%, at least 
70%, at least 75%, at least 80%, at least 85%, at least 90%, at least 95%, or at least 99% 
relative to a control (e.g., PBS) in an assay described herein or well-known in the art. 
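
By way of illustration only, the percentage reductions recited in the preceding paragraphs can be computed from a treated readout and a control readout as sketched below. The function names, and the idea of reporting the highest recited threshold that is met, are assumptions introduced for this example and are not part of the assays described herein.

```python
# Illustrative sketch: percent reduction of a readout (e.g., cell count,
# viral titer, tumor size) relative to a control such as PBS.
# All names here are hypothetical.

# Thresholds recited in the text: 5%, 10%, ..., 95%, and 99%.
THRESHOLDS = list(range(5, 100, 5)) + [99]


def percent_reduction(treated: float, control: float) -> float:
    """Return the percent reduction of `treated` relative to `control`."""
    if control <= 0:
        raise ValueError("control readout must be positive")
    return 100.0 * (control - treated) / control


def highest_threshold_met(reduction: float):
    """Return the largest recited threshold met by `reduction`, or None."""
    met = [t for t in THRESHOLDS if reduction >= t]
    return max(met) if met else None
```

For example, a treated readout of 40 against a control readout of 100 corresponds to a 60% reduction, which meets the "at least 60%" threshold.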
[0080] As used herein, the terms "therapies" and "therapy" can refer to any protocol(s), 
method(s), and/or agent(s) that can be used in the prevention, treatment, management, or 
amelioration of a disease or disorder or one or more symptoms thereof. 
[0081] As used herein, the terms "treat", "treatment" and "treating" refer to the 
reduction or amelioration of the progression, severity and/or duration of a disease or 
disorder or one or more symptoms thereof resulting from the administration of one or more 
compounds identified in accordance with the methods of the invention, or the administration of a 
combination of therapies (e.g., a compound identified in accordance with the methods of the 
invention and another therapeutic agent). In certain embodiments, such terms refer to the 
inhibition or reduction in the proliferation of cancerous cells, the inhibition or reduction in the 
spread of tumor cells (metastasis), the inhibition or reduction in the onset, development or 
progression of one or more symptoms associated with cancer, or the reduction in the size of 
a tumor. In other embodiments, such terms refer to the reduction or inhibition of the 
replication of a virus, the inhibition or reduction in the spread of a virus to other tissues or 
subjects, or the amelioration of one or more symptoms associated with a viral infection. In 
other embodiments, such terms refer to the reduction or inhibition of the replication of a 
fungus, the reduction or inhibition in the spread of a fungus to other tissues or subjects, or 
the amelioration of one or more symptoms associated with a fungal infection. In other 
embodiments, such terms refer to the inhibition or reduction of the replication of a bacteria, 
the inhibition or reduction in the spread of a bacteria to other tissues or subjects, or the 
amelioration of one or more symptoms associated with a bacterial infection. In other 
embodiments, such terms refer to a reduction in the swelling of one or more joints, organs 
or tissues, or a reduction in the pain associated with an inflammatory disorder. 
[0082] As used herein, the term "UTR" refers to the untranslated region of a mRNA, 
i.e., the region of the mRNA that is not translated into protein. In a preferred embodiment, 
the UTR is a 5' UTR, i.e., upstream of the coding region, or a 3' UTR, i.e., downstream of 
the coding region. In another embodiment, the term UTR corresponds to a reading frame of 
the mRNA that is not translated. In another embodiment, a UTR contains a fragment of an 
untranslated region of a mRNA. In a preferred embodiment, a UTR contains one or more 
regulatory elements that modulate untranslated region-dependent regulation of gene 
expression. 

[0083] As used herein, the term "uORF" refers to an upstream open reading frame that 
is in the 5' UTR of the main open reading frame, i.e., that encodes a functional protein, of a 
mRNA. 

[0084] As used herein, the term "untranslated region-dependent expression" or "UTR- 
dependent expression" refers to the regulation of gene expression through untranslated 
regions at the level of mRNA expression, i.e., after transcription of the gene has begun until 
the protein or the RNA product(s) encoded by the gene has degraded. In a preferred 
embodiment, the term "untranslated region-dependent expression" or "UTR-dependent 
expression" refers to the regulation of mRNA stability or translation. In a more preferred 
embodiment, the term "untranslated region-dependent expression" refers to the regulation 
of gene expression through regulatory elements present in an untranslated region(s). 

4. DESCRIPTION OF DRAWINGS 

[0085] FIGS. 1A-1B: Schematic representation of the VEGF 5'- and 3'-UTRs 
generated by PCR. A. VEGF 5'UTR was amplified from human genomic DNA by two 
separate PCR reactions. 5'UTR1, from position 337 to the 3' end plus the first 45 nucleotides 
of the VEGF open reading frame, was generated using primers 3 and 4. 5'UTR2, covering 
position 1 to 498, was generated with primers 1 and 2. In the overlap region of 5'UTR1 and 
5'UTR2, the unique enzyme site BamH I was used to assemble the full length 5'UTR in the 
subsequent cloning. B. The full length VEGF 3'UTR was directly amplified from genomic 
DNA using primers 5 and 6. The two enzyme sites close to the 5' end and 3' end of the 3'UTR 
(Bgl II and EcoR I) were used for subsequent cloning. 

[0086] FIGS. 2A-2C: Identification of VEGF IRES domain in the VEGF mRNA 
5'UTR. A. Dual luciferase vector used for mapping IRES function (Grentzmann et al., 
1998, RNA 4:479-486). B. Schematic representation of the dicistronic plasmids used for 
transfection experiments. P2luc/vegf5utr1 is the dicistronic plasmid containing the VEGF 
5'UTR1, in which nucleotides 337 to 1083 of the VEGF cDNA were fused to the firefly 
luciferase coding sequence; P2luc/vegf5utr-fl was generated by subcloning VEGF 5'UTR2 
into the plasmid p2luc/vegf5utr1 between Sal I and BamH I; plasmid p2luc/vegf5'utr- 
delta51-476 is derived from p2luc/vegf5'utr-fl by removing the Nhe I fragment (nt 51 to 
746); plasmid p2luc/vegf5utr-delta476-1038 was derived from p2luc/vegf5utr-fl by 
removing the sequence from the BamH I site to the 3' end of the 5'UTR; plasmid p2luc/vegf5utr- 
delta1-476 was derived from p2luc/vegf5utr-fl by removing the sequence from BamH I to 
the 5' end of the 5'UTR. P2luc-e was used as a negative control in this study. C. The constructs 
depicted in panel A were transfected into 293T cells in the triplicate format and expression 
was analyzed by monitoring luciferase activity. 

[0087] FIGS. 3A-3B: Generation of stable cell lines for cell based high throughput 
screening ("HTS"). A. Schematic representation of the monocistronic plasmid used in this 
study for generation of stable cell lines. B. Screening of stable cell lines. The plasmid 
depicted in panel A was transfected into 293T cells. 48 hours later, the transfected cells 
were seeded in 96 well plates at 100-500 cells per well and 200 mg/ml hygromycin was 
added for selection. The culture media plus hygromycin was changed every 3 to 4 days. 
After 2 weeks of selection, cells were screened under a microscope and single colony wells 
were expanded for further luciferase assays. The chart in panel B shows the luciferase 
activities for 19 stable clones. 

[0088] FIG. 4: Side by side comparison of luciferase activities for three stable clones 
(B9, D3 and H6). For each cell line, 5x10^ cells per well were seeded in a 24 well plate. 48 
hours later, cells were lysed and assayed for luciferase activities. The luciferase activities 
were normalized against protein concentration. 

[0089] FIG. 5: Sustained high expression of luciferase by B9 cells. B9 cells were 
continuously cultured in vitro for more than 3 months. At the time points indicated, 
luciferase activity was tested with Promega's Bright Glow substrate and normalized against 
the protein concentration. 

[0090] FIGS. 6A-6B: Reporter gene integration in B9 cells. The integration levels of 
the reporter gene were determined using semi-quantitative PCR. Serially diluted plasmid 
pluc5'+3'vegf-UTR was included as a positive control to make sure the reaction for the sample 
(genomic DNA from B9 cells) was in the linear range, i.e., not saturated. Panel A shows 
the PCR results for the sample and positive control. The PCR band intensity for each reaction 
is at the bottom of the picture. Panel B shows the PCR standard curve, plotted with PCR 
band intensity against the amount of positive control plasmid loaded for PCR. 
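
For illustration, a standard curve of the kind shown in panel B can be used to estimate the amount of template in a sample by fitting a line to the serially diluted control and interpolating. The sketch below assumes a linear relationship between band intensity and plasmid amount within the range of the standards; the function names and the least-squares fit are choices made for this example and are not taken from the figure.

```python
# Illustrative sketch: interpolate a sample from a semi-quantitative PCR
# standard curve. Names and the linear model are assumptions.

def fit_line(x, y):
    """Ordinary least-squares fit of y = slope * x + intercept."""
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    sxx = sum((xi - mean_x) ** 2 for xi in x)
    sxy = sum((xi - mean_x) * (yi - mean_y) for xi, yi in zip(x, y))
    slope = sxy / sxx
    return slope, mean_y - slope * mean_x


def estimate_amount(sample_intensity, amounts, intensities):
    """Estimate template amount for a sample band intensity.

    Raises ValueError if the sample lies outside the intensities of the
    standards, i.e., outside the range in which the curve was measured.
    """
    if not min(intensities) <= sample_intensity <= max(intensities):
        raise ValueError("sample intensity is outside the standard curve")
    slope, intercept = fit_line(amounts, intensities)
    return (sample_intensity - intercept) / slope
```

Rejecting samples that fall outside the standards mirrors the stated purpose of the positive control, namely confirming that the sample reaction is in the linear range and not saturated.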
[0091] FIGS. 7A-7B: The 5' UTR of survivin can function as an internal ribosome 
entry site (IRES). A. Firefly luciferase assays on 293T cells transiently transfected with 
the survivin expression vectors in the absence of a stem-loop secondary structure. "5' + 3' 
UTR" represents the survivin expression vector containing the firefly luciferase reporter 
gene surrounded by both the 5' and 3' untranslated regions of survivin. "5' UTR" 
represents the survivin expression vector containing the firefly luciferase reporter gene 
preceded only by the 5' UTR of survivin. "3' UTR" represents the survivin expression 
vector containing the firefly luciferase reporter gene followed only by the 3' UTR of 
survivin. "no UTR" represents the survivin expression vector containing the firefly 
luciferase reporter gene lacking any surrounding untranslated regions of survivin. The 
survivin expression vectors were transiently transfected into 293T cells in duplicate 
(represented by the two bars for each construct in the graph) and firefly luciferase activity 
(measured in quadruplicate) was normalized to total protein concentration in each of the 
cell lysates. B. As in FIG. 7A, except that the survivin expression vectors containing the 
stem-loop secondary structure to separate cap-dependent from cap-independent translation 
were used. 

[0092] FIGS. 8A-8C: Expression Vectors. A. Schematic representation of pCMR1, a 
high-level stable and transient mammalian expression vector designed to randomly integrate 
into the genome. B. Schematic representation of pMCP1, a high level stable and transient 
mammalian expression vector designed to site-specifically integrate into the genome of 
cells genetically engineered to contain the FRT site-specific recombination site via the Flp 
recombinase (see, e.g., Craig, 1988, Ann. Rev. Genet. 22: 77-105; and Sauer, 1994, Curr. 
Opin. Biotechnol. 5: 521-527). C. Schematic representation of pCMR2, an episomal 
mammalian expression vector. 

5. DETAILED DESCRIPTION OF THE INVENTION 

[0093] The present invention provides methods for identifying compounds that 
modulate the untranslated region-dependent expression of any target gene. In particular, the 
invention provides simple, rapid and sensitive methods for identifying compounds that 
modulate untranslated region-dependent expression of a target gene utilizing reporter gene-based 
constructs comprising one or more mRNA untranslated regions ("UTRs") of the 
target gene. The reporter gene-based assays described herein can be utilized in a high 
throughput format to screen libraries of compounds to identify those compounds that 
modulate untranslated region-dependent expression of a target gene. 
[0094] The reporter gene-based assays of the invention reduce the bias introduced by 
competitive binding assays which require the identification and use of a host cell factor 
(presumably essential for modulating RNA function) as a binding partner for the target 
RNA. The reporter gene-based assays of the invention are designed to detect any 
compound that modulates untranslated region-dependent expression of a target gene under 
physiologic conditions. 

[0095] The reporter gene-based assays may be conducted by contacting a compound 
with a cell genetically engineered to express a nucleic acid comprising a reporter gene 
operably linked to one or more untranslated regions (preferably, the 5' and/or 3' UTRs) of a 
target gene, and measuring the expression of said reporter gene. Alternatively, the reporter 
gene-based assays may be conducted by contacting a compound with a cell-free translation 
mixture and a nucleic acid comprising a reporter gene operably linked to one or more 
untranslated regions of a target gene, and measuring the expression of said reporter gene. 
The alteration in reporter gene expression relative to a previously determined reference 
range or a control in such reporter gene-based assays indicates that a particular compound 
modulates untranslated region-dependent expression of a target gene. In order to exclude 
the possibility that a particular compound is functioning solely by modulating the 
expression of a target gene in an untranslated region-independent manner, one or more 
mutations (i.e., deletions, insertions, or nucleotide substitutions) may be introduced into the 
untranslated regions operably linked to a reporter gene and the effect on the expression of 
the reporter gene in a reporter gene-based assay described herein can be determined. 
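
As a non-limiting illustration of the comparison described in the preceding paragraph, the sketch below flags a compound when reporter expression falls outside a reference range derived from control wells, and then checks whether the same compound leaves a mutated-UTR reporter unaffected. The three-standard-deviation cutoff, the function names, and the use of a mutated-UTR construct as a simple yes/no counter-screen are assumptions made for this example.

```python
# Illustrative sketch: calling a hit against a control-derived reference
# range, with a mutated-UTR counter-screen. All names and the 3-SD cutoff
# are hypothetical.
from statistics import mean, stdev


def percent_of_control(signal, control_signals):
    """Reporter signal expressed as a percentage of the control mean."""
    return 100.0 * signal / mean(control_signals)


def outside_reference_range(signal, control_signals, n_sd=3.0):
    """True if `signal` lies more than `n_sd` standard deviations from
    the mean of the control wells (at least two controls are required)."""
    mu = mean(control_signals)
    sd = stdev(control_signals)
    return abs(signal - mu) > n_sd * sd


def is_utr_dependent_hit(wt_signal, wt_controls,
                         mut_signal, mut_controls, n_sd=3.0):
    """True if the compound alters the wild-type UTR reporter but not the
    reporter carrying the mutated untranslated region(s)."""
    alters_wt = outside_reference_range(wt_signal, wt_controls, n_sd)
    alters_mut = outside_reference_range(mut_signal, mut_controls, n_sd)
    return alters_wt and not alters_mut
```

A compound that shifts both the wild-type and the mutated reporter would not be flagged by `is_utr_dependent_hit`, consistent with excluding compounds that act solely in an untranslated region-independent manner.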
[0096] The compounds identified in the reporter gene-based assays described herein 
that modulate untranslated region-dependent expression may be tested in in vitro assays 
(e.g., cell-free assays) or in vivo assays (e.g., cell-based assays) well-known to one of skill 
in the art or described herein for the effect of said compounds on the expression of the 
target gene from which the untranslated regions of the reporter gene construct were derived. 
Further, the specificity of a particular compound's effect on untranslated region-dependent 
expression of one or more other genes (preferably, a plurality of genes) can be determined 
utilizing assays well-known to one of skill in the art or described herein. In a preferred 
embodiment, a compound identified utilizing the reporter gene-based assays described 
herein has a specific effect on the expression of only one gene or a group of genes within 
the same signaling pathway. 

[0097] The structure of the compounds identified in the reporter gene-based assays 
described herein that modulate untranslated region-dependent expression can be determined 
utilizing assays well-known to one of skill in the art or described herein. The methods used 
will depend, in part, on the nature of the library screened. For example, assays or 
microarrays of compounds, each having an address or identifier, may be deconvoluted, e.g., 
by cross-referencing the positive sample to an original compound list that was applied to the 
individual test assays. Alternatively, the structure of the compounds identified herein may 
be determined using mass spectrometry, nuclear magnetic resonance ("NMR"), X-ray 
crystallography, or vibrational spectroscopy. 
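
The deconvolution by cross-referencing described above can be sketched, for illustration only, as a lookup from the address of each positive sample to the original compound list. The well-address format and the function name below are hypothetical and are introduced solely for this example.

```python
# Illustrative sketch: deconvolute positive samples by cross-referencing
# their addresses against the original compound list. Names and the
# address format are hypothetical.

def deconvolute(positive_addresses, compound_list):
    """Map each positive address to its compound identifier.

    Returns (identified, unmatched): a dict of address -> identifier and
    a list of addresses missing from the original compound list.
    """
    identified = {}
    unmatched = []
    for address in positive_addresses:
        if address in compound_list:
            identified[address] = compound_list[address]
        else:
            unmatched.append(address)
    return identified, unmatched


# Example usage with made-up addresses and identifiers.
plate_map = {"A1": "compound-0001", "A2": "compound-0002", "B7": "compound-0019"}
hits, missing = deconvolute(["A2", "B7", "H12"], plate_map)
```

Addresses that cannot be matched are returned separately rather than dropped, so that a positive sample without a record in the compound list remains visible for follow-up by the structural methods listed above.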

[0098] The invention encompasses the use of the compounds identified in accordance 
with the methods described herein for the modulation (i.e., upregulation or downregulation) 
of the expression of a target gene. The upregulation or downregulation of a target gene is 
particularly useful in vitro when attempting to produce a protein encoded by said target 
gene for use as a therapeutic or prophylactic agent, or in experiments conducted to, e.g., 
identify the function or efficacy of said protein. The invention also encompasses the use of 
the compounds identified in accordance with the methods described herein for the 
prevention, treatment or amelioration of a disease or disorder or a symptom thereof. 
Examples of diseases and disorders which may be prevented, treated or ameliorated 
utilizing a compound identified in accordance with the invention include, but are not limited 
to, proliferative disorders, disorders associated with aberrant angiogenesis, inflammatory 
disorders, infectious diseases, genetic disorders, autoimmune disorders, cardiovascular 
diseases, and central nervous system disorders. In an embodiment wherein the disease or 
disorder is an infectious disease, the infectious disease can be caused by a fungal infection, 
a bacterial infection, a viral infection, or an infection caused by another type of pathogen. 

5.1. Untranslated Regions 

[0099] Any untranslated region may be utilized in the reporter gene constructs 
described herein. An untranslated gene region(s) may be obtained or derived from a gene 
from any species, including, but not limited to, plants (e.g., soybean, canola, cotton, wheat, 
corn, rice, potato, and tomato plants), viruses, bacteria, fungus and animals (including, but 
not limited to, mammals (primates and non-primates), farm animals (e.g., horses, pigs, 
cows, donkeys, etc.), pets (e.g., guinea pigs, cats, and dogs), and humans). Untranslated 
regions may be obtained and the nucleotide sequence of the untranslated regions determined 
by any method well-known to one of skill in the art. The nucleotide sequence of an 
untranslated region for a target gene can be obtained, e.g., from the literature or a database 
such as GenBank. Alternatively, the nucleotide sequence of the untranslated regions of a 
target gene may be generated from nucleic acid from a suitable source. If a clone 
containing a nucleic acid of an untranslated region of a target gene is not available, but the 
sequence of the untranslated region is known, a nucleic acid of the untranslated region may 
be chemically synthesized or obtained from a suitable source (e.g., a cDNA library) by PCR 
amplification. Once the nucleotide sequence of an untranslated region is determined, the 
nucleotide sequence of the untranslated region may be manipulated using methods well-known 
in the art for the manipulation of nucleotide sequences, e.g., recombinant DNA 
techniques, site directed mutagenesis, PCR, etc. (see, for example, the techniques described 
in Sambrook et al., 1990, Molecular Cloning, A Laboratory Manual, 2d Ed., Cold Spring 
Harbor Laboratory, Cold Spring Harbor, NY and Ausubel et al., eds., 1998, Current 
Protocols in Molecular Biology, John Wiley & Sons, NY, which are both incorporated by 
reference herein in their entireties), to generate an untranslated region having a different 
nucleotide sequence. 

[00100] In one embodiment, an untranslated gene region(s) is obtained or derived from a 
gene whose expression is associated with or has been linked to the onset, development, 
progression or severity of a particular disease or disorder. In another embodiment, an 
untranslated gene region(s) is obtained or derived from a gene whose expression is 
beneficial to a subject with a particular disease or disorder. Examples of genes from which 
the untranslated regions may be obtained or derived include, but are not limited to, 
cytokines, cytokine receptors, T cell receptors, B cell receptors, co-stimulatory molecules, 
clotting cascade factors, cyclins, cyclin inhibitors, oncogenes, growth factors, growth factor 
receptors, tumor suppressors, apoptosis inhibitor proteins, cell adhesion molecules, 
hormones, GTP-binding proteins, glycoproteins, ion channel receptors, calcium channel 
pumps, steroid receptors, opioid receptors, sodium channel pumps, heat shock proteins, 
MHC proteins, and tumor-associated antigens ("TAAs"). 

[00101] Specific examples of genes from which the untranslated regions may be 
obtained or derived include, but are not limited to, the gene encoding abl, the gene 
encoding acetyl CoA carboxylase beta ("ACC2"; see, e.g., OMIM accession number 
601557, locus link accession number 32), the gene encoding acetylcholinesterase ("ACHE"; 
see, e.g., OMIM accession number 100740, locus link accession number 43, GenBank 
accession number NM_000665), the gene encoding actin, alpha cardiac ("ACTC"; see, e.g., 
OMIM accession number 102540, locus link accession number 70), the gene encoding acyl- 
CoA dehydrogenase ("ACADVL"; see, e.g., OMIM accession number 201475, locus link 
accession number 37), the gene encoding adiponectin ("ACRP30"; see, e.g., OMIM 
accession number 605441, locus link accession number 9370, GenBank accession number 
NM_004797), the gene encoding ADP-ribosylation factor-4 ("ARF4"; see, e.g., OMIM 
accession number 601177, locus link accession number 378, GenBank accession number 
NM 0017 ev 25), the gene encoding alpha-glucosidase, the gene encoding Alzheimer's 
disease amyloid A4 ("APP" or "A4" or "CVAP" or "AD1"; see, e.g., OMIM accession 
number 104760, locus link accession number 351), the gene encoding angiogenin ("ANG" 
or "RNASE5"; see, e.g., OMIM accession number 105850, locus link accession number 
283, GenBank accession number NM_001145), the gene encoding angiopoietin1 ("ANG1"; 
see, e.g., OMIM accession number 601667, locus link accession number 284), the gene 
encoding angiopoietin2 ("ANG2"; see, e.g., OMIM accession number 601922, locus link 
accession number 285), the gene encoding angiostatin, the gene encoding angiotensin 1- 
converting enzyme ("DCP1"; see, e.g., OMIM accession number 106180, locus link 
accession number 1636), the gene encoding antigen CD82 ("KAI1"; see, e.g., OMIM 
accession number 600623, locus link accession number 3732, GenBank accession number 
NM_002231), the gene encoding APC, the gene encoding atrial natriuretic factor, the gene 
encoding bactericidal/permeability-increasing protein ("BPI"; see, e.g., OMIM accession 
number 109195, locus link accession number 671, GenBank accession number 
NM_001725), the gene encoding bcl-2, the gene encoding beta-catenin ("CTNNB1"; see, e.g., OMIM 
accession number 116806, locus link accession number 1499), the gene encoding beta-site 
APP-cleaving enzyme 2 ("BACE2"; see, e.g., OMIM accession number 605668, locus link 
accession number 25825, GenBank accession number NM_138992), the gene encoding bile 
salt export pump ("ABCB11"; see, e.g., OMIM accession number 603201, locus link 
accession number 8647), the gene encoding BMP, the gene encoding BDNF, the gene 
encoding bombesin receptor, the gene encoding brca1, the gene encoding brca2, the gene 
encoding C1q complement receptor (see, e.g., OMIM accession number 120577, locus link 
accession number 22918), the gene encoding c-fms, the gene encoding c-myc, the gene 
encoding calcitonin, the gene encoding calcium-binding protein in macrophages ("MRP14"; 
see, e.g., OMIM accession number 123886, locus link accession number 6280, GenBank 
accession number NM_002965), the gene encoding calsenilin ("DREAM/CSEN" or 
"CREAM" or "KChIP3"; see, e.g., OMIM accession number 604662, locus link accession 
number 30818, GenBank accession number NM 0134), the gene encoding carnitine 
o-palmitoyltransferase ("CPT2"; see, e.g., OMIM accession number 600650, locus link 
accession number 1376), the gene encoding catechol-o-methyltransferase ("COMT"; see, 
e.g., OMIM accession number 116790, locus link accession number 1312, GenBank 
accession number NM_000754, NM_007310), the gene encoding cathepsin K, the gene 
encoding CD40 ligand ("TNFSF5"; see, e.g., OMIM accession number 300386, locus link 
accession number 959), the gene encoding cdk4 inhibitor, the gene encoding chemokine 
(C-C) receptor ("IL13R"; see, e.g., OMIM accession number 601268, locus link accession 
number 1232), the gene encoding chemokine (C-X3-C) receptor 1 ("CX3CR1"; see, e.g., 
OMIM accession number 601470, locus link accession number 1524), the gene encoding 
CLCA homolog ("hCLCA2"; see, e.g., OMIM accession number 604003, locus link 
accession number 9635, GenBank accession number NM_006536), the gene encoding 
complement decay-accelerating factor ("DAF/CD55"; see, e.g., OMIM accession number 
125240, locus link accession number 1604), the gene encoding connective tissue growth 
factor ("CTGF"; see, e.g., OMIM accession number 121009, locus link accession number 
1490), the gene encoding corticotrophin releasing factor, the gene encoding CTLA4, the 
gene encoding cyclin D1, the gene encoding cyclin E, the gene encoding cyclin T1 (see, 
e.g., OMIM accession number 602506, locus link accession number 904, GenBank 
accession number NM_001240), the gene encoding cyclin-dependent kinase inhibitor 1A 
("p21" or "WAF1" or "CDKN1A" or "Cip1"; see, e.g., OMIM accession number 116899, 
locus link accession number 1026, GenBank accession number NM_078467), the gene 
encoding cyclin-dependent kinase inhibitor 2A ("CDKN2A"; see, e.g., OMIM accession 
number 600160, locus link accession number 1029), the gene encoding cystic fibrosis 
transmembrane conductance regulator ("CFTR"), the gene encoding cytochrome P-450, the 
gene encoding D-1 dopamine receptor ("DRD1"; see, e.g., OMIM accession number 
126449, locus link accession number 1812, GenBank accession number NM 00794, 
X589987), the gene encoding D-amino-acid oxidase ("DAO"; see, e.g., OMIM accession 
number 124050, locus link accession number 1610, GenBank accession number 
NM_001917), the gene encoding damage specific DNA binding protein ("DDB1"; see, e.g., OMIM 
accession number 600045, locus link accession number 1642), the gene encoding DCC, the 
gene encoding desmoglein 1 ("DSG1"; see, e.g., OMIM accession number 125670, locus 
link accession number 1828), the gene encoding a dihydrofolate reductase ("DHFR"; see, 
e.g., OMIM accession number 126060, locus link accession number 1719, GenBank 
accession number NM_000791), the gene encoding a disintegrin and metalloproteinase 
domain 33 ("ADAM 33"; see, e.g., OMIM accession number 607114, locus link accession 
number 80332), the gene encoding DNA methyltransferase ("DNMT3b"; see, e.g., OMIM 
accession number 602900, locus link accession number 1789), the gene encoding DPP-IV, 
the gene encoding drebrin-1 dendritic spine protein ("DBN1"; see, e.g., OMIM accession 
number 126660, locus link accession number 1627, GenBank accession number 
NM_004395, NM_080881), the gene encoding E-cadherin, the gene encoding effector cell 
protease receptor ("EPR1"; see, e.g., OMIM accession number 603411, locus link accession 
number 8475), the gene encoding EGF, the gene encoding EGFR (see, e.g., OMIM 
accession number 131550, locus link accession number 1956), the gene encoding an EGFR 
subunit, the gene encoding EIF4BP (see, e.g., OMIM accession number 602223, locus link 
accession number 1978, GenBank accession number NM_004095), the gene encoding 
EMMPRIN (see, e.g., OMIM accession number 109480, locus link accession number 682, 
GenBank accession number NM_001728), the gene encoding emotakin ATP-binding 
cassette, sub-family a, member 1 ("ABCA1"; see, e.g., OMIM accession number 600046, 
locus link accession number 19), the gene encoding endostatin, the gene encoding eotaxin 
("CCL11"; see, e.g., OMIM accession number 601156, locus link accession number 6356, 
GenBank accession number NM_002986), the gene encoding erythropoietin ("EPO"; see, 
e.g., OMIM accession number 133170, locus link accession number 2056, GenBank 
accession number NM_000799), the gene encoding estrogen receptor, the gene encoding 
factor IX, the gene encoding factor VIII, the gene encoding farnesyl transferase, the gene 
encoding FGF, the gene encoding FGF1 (see, e.g., OMIM accession number 131220, locus 
link accession number 2246, GenBank accession number ), the gene encoding FGF2 (see, 
e.g., OMIM accession number 134920, locus link accession number 2247, GenBank 
accession number NM_002006), the gene encoding FGFR, the gene encoding fibrillin 
("FBN1"; see, e.g., OMIM accession number 134797, locus link accession number 2200), 
the gene encoding FMS-related tyrosine kinase 1 ("FLT1"; see, e.g., OMIM accession 
number 165070, locus link accession number 2321, GenBank accession number 
NM_002019), the gene encoding forkhead box C2 ("FOXC2"; see, e.g., OMIM accession number 
602402, locus link accession number 2303, GenBank accession number NM_005251), the 
gene encoding fos (see, e.g., OMIM accession number 164810, locus link accession number 
2353, GenBank accession number NM_005252), the gene encoding G-CSF, the gene 
encoding G-CSF 3 ("CSF3"; see, e.g., OMIM accession number 138970, locus link 
accession number 1440), the gene encoding a GABA receptor, the gene encoding galanin 
("GAL"; see, e.g., OMIM accession number 137035, locus link accession number 2586), 
the gene encoding gastric inhibitory polypeptide ("GIP"; see, e.g., OMIM accession number 
137240, locus link accession number 2695), the gene encoding GDNF, the gene encoding 
GGF, the gene encoding GGRP, the gene encoding ghrelin ("GHRL"; see, e.g., OMIM 
accession number 605353, locus link accession number 51738), the gene encoding gip, the 
gene encoding glucagon, the gene encoding glucagon receptor ("GCGR"; see, e.g., OMIM 
accession number 138033, locus link accession number 2642), the gene encoding 
glucagon-like peptide-1 ("GLP1"; see, e.g., OMIM accession number 138030, locus link accession 
number 2641), the gene encoding glucokinase ("GCK"; see, e.g., OMIM accession number 
138079, locus link accession number 2645, GenBank accession number NM_000162), the 
gene encoding glutamic acid decarboxylase 2 (see, e.g., OMIM accession number 138275), 
the gene encoding glutamic acid decarboxylase 3 (see, e.g., OMIM accession number 
138276), the gene encoding glutamic acid decarboxylase, brain, membrane form (see, e.g., 
OMIM accession number 138277), the gene encoding glycogen synthase kinase-3A 
("GSK-3A"; see, e.g., OMIM accession number 606784, locus link accession number 2931), the 
gene encoding glycogen synthase kinase-3B ("GSK-3B"; see, e.g., OMIM accession 
number 605004, locus link accession number 2932), the gene encoding GM-CSF (see, e.g., 
OMIM accession number 138960, locus link accession number 1437), the gene encoding 
gonadotropin, the gene encoding gonadotropin releasing hormone, the gene encoding 
GRO2 oncogene or macrophage inflammatory protein-2-alpha precursor ("CXCL2"; see, 
e.g., OMIM accession number 139110, locus link accession number 2920), the gene 
encoding growth hormone releasing factor, the gene encoding growth hormone, the gene 
encoding gsp, the gene encoding H-ras, the gene encoding heat shock protein ("HSP")-70, 
the gene encoding heparanase ("HPA"; see, e.g., OMIM accession number 604724, locus 
link accession number 10855), the gene encoding hepatitis A virus cellular receptor 
("HAVCR"; see, e.g., OMIM accession number 606518, locus link accession number 
26762), the gene encoding hepatitis B virus X interacting protein ("HBXIP"), the gene 
encoding hepsin ("HPN"; see, e.g., OMIM accession number 142440, locus link accession 
number 3249, GenBank accession number NM_002151), the gene encoding Her-2 
("ERBB2"; see, e.g., OMIM accession number 164870, locus link accession number 2064), 
the gene encoding HGF, the gene encoding histone acetyltransferase ("HAT1"; see, e.g., 
OMIM accession number 603053, locus link accession number 8520, GenBank accession 
number NM_003642), the gene encoding histone deacetylase 1 ("HDAC1"; see, e.g., 
OMIM accession number 601241, locus link accession number 3065), the gene encoding 
histone deacetylase 3 ("HDAC3"; see, e.g., OMIM accession number 605166, locus link 
accession number 8841, GenBank accession number NM_003883), the gene encoding 
HIV Tat Specific Factor 1 ("HTATSF1"; see, e.g., OMIM accession number 300346, locus 
link accession number 27336), the gene encoding HMG CoA synthetase, the gene encoding 
HSP-90, the gene encoding huntingtin ("HD"; see, e.g., OMIM accession number 143100, 
locus link accession number 3064, GenBank accession number NM_002111), the gene 
encoding Hu antigen R ("HUR"; see, e.g., OMIM accession number 603466, locus link 
accession number 1994, GenBank accession number NM_001419), the gene encoding 3- 
hydroxy-3-methylglutaryl-CoA reductase ("HMGCR"; see, e.g., OMIM accession number 
142910, locus link accession number 3156), the gene encoding hypoxia-inducible factor 1 
("HIF-1A"; see, e.g., OMIM accession number 603348, locus link accession number 3091), 
the gene encoding hypoxia-inducible factor 1-alpha inhibitor ("HIF1AN"; see, e.g., OMIM 
accession number 606615, locus link accession number 55662), the gene encoding 
iduronate 2-sulfatase ("IDS"; see, e.g., OMIM accession number 309900, locus link 
accession number 3423), the gene encoding IGF-1 (see, e.g., OMIM accession number 
147440, locus link accession number 3486), the gene encoding IGF-1R (see, e.g., OMIM 
accession number 147370, locus link accession number 3480, GenBank accession number 
NM_000875), the gene encoding IGF-2, the gene encoding IGF binding protein-2 
("IGFBP2"; see, e.g., OMIM accession number 146731, locus link accession number 3485), 
the gene encoding IκB kinase ("IKBKB"; see, e.g., OMIM accession number 603258, locus 
link accession number 3551), the gene encoding inositol polyphosphate phosphatase-like 1 
("SHIP-2"; see, e.g., OMIM accession number 600829, locus link accession number 3636, 
GenBank accession number NM_001567), the gene encoding insulin, the gene encoding 
interferon inducible protein ("CXCL10 (IP10)"; see, e.g., OMIM accession number 147310, 
locus link accession number 3627, GenBank accession number NM_001565), the gene 
encoding interferon ("IFN")-α, the gene encoding interferon-α 1/13 precursor, the gene 
encoding interferon-α 5 precursor ("IFNA5"; see, e.g., OMIM accession number 147565, 
locus link accession number 3442), the gene encoding interferon-α 16 precursor 
("IFNA16"; see, e.g., OMIM accession number 147580, locus link accession number 3449), 
the gene encoding IFN-β, the gene encoding IFN-β 1 ("IFNB1"; see, e.g., OMIM accession 
number 147640, locus link accession number 3456), the gene encoding IFN-γ (see, e.g., 
OMIM accession number 147440, locus link accession number 3479), the gene encoding 
insulin receptor ("INSR"; see, e.g., OMIM accession number 147670, locus link accession 
number 3643, GenBank accession number NM_000208), the gene encoding interleukin-1 b 
("IL1B"; see, e.g., OMIM accession number 147720, locus link accession number 3553), 
the gene encoding interleukin-2 ("IL-2"; see, e.g., OMIM accession number 147680, locus 
link accession number 3558), the gene encoding interleukin-3 ("IL-3"), the gene encoding 
interleukin-4 ("IL-4"; see, e.g., OMIM accession number 147780, locus link accession 
number 3565, GenBank accession number NM_000589), the gene encoding interleukin-4 
receptor ("IL4R"; see, e.g., OMIM accession number 147781, locus link accession number 
3566, GenBank accession number NM_000418), the gene encoding interleukin-5 ("IL-5"), 
the gene encoding interleukin-6 ("IL-6"; see, e.g., OMIM accession number 147620, locus 
link accession number 3569), the gene encoding interleukin-7 ("IL-7"), the gene encoding 
interleukin-8 ("IL-8"; see, e.g., OMIM accession number 146930, locus link accession 
number 3576), the gene encoding interleukin-9 ("IL-9"; see, e.g., OMIM accession number 
146931, locus link accession number 3578), the gene encoding interleukin-10 ("IL-10"; see, 
e.g., OMIM accession number 124092, locus link accession number 3586, GenBank 
accession number NM_000572), the gene encoding interleukin-12 ("IL-12"), the gene 
encoding interleukin-12 beta chain precursor ("IL12B"; see, e.g., OMIM accession number 
161561, locus link accession number 3593), the gene encoding interleukin-13 ("IL-13"; see, 
e.g., OMIM accession number 147683, locus link accession number 3596, GenBank 
accession number NM_002188), the gene encoding interleukin-15 ("IL-15"), the gene 
encoding interleukin-17F ("ML1"; see, e.g., OMIM accession number 606496, locus link 
accession number 11274), the gene encoding interleukin-18 ("IL-18"; see, e.g., OMIM 
accession number 600953, locus link accession number 3606), the gene encoding 
INI1/hSNF5 (see, e.g., OMIM accession number 601607, locus link accession number 
6598), the gene encoding jun, the gene encoding kallikrein 6 ("KLK6"; see, e.g., OMIM 
accession number 602652, locus link accession number 5653, GenBank accession number 
NM_002774), the gene encoding KGF, the gene encoding ki-ras, the gene encoding kit 
ligand, stem cell factor ("KITLG (SCF)"; see, e.g., OMIM accession number 184745, locus 
link accession number 4254, GenBank accession number NM_000899), the gene encoding 
klotho ("KL"; see, e.g., OMIM accession number 604824, locus link accession number 
9365, GenBank accession number NM_004795), the gene encoding L-myc, the gene 
encoding large tumor suppressor ("LATS1"; see, e.g., OMIM accession number 603473, 
locus link accession number 9113, GenBank accession number NM_004690), the gene 
encoding LDL receptor ("LDLR"; see, e.g., OMIM accession number 606945, locus link 
accession number 3949, GenBank accession number NM_000527), the gene encoding 
leptin ("LEP"; see, e.g., OMIM accession number 164160, locus link accession number 
3952, GenBank accession number NM_000230), the gene encoding leptin receptor 
("LEPR"; see, e.g., OMIM accession number 601007, locus link accession number 3953), 
the gene encoding leucine amino peptidase-3 ("LAP3"; see, e.g., OMIM accession number 
606832, locus link accession number 51056), the gene encoding leukemia inhibitory factor 
("LIF"; see, e.g., OMIM accession number 159540, locus link accession number 3976), the 
gene encoding leukemia inhibitory factor receptor ("LIFR"; see, e.g., OMIM accession 
number 151443, locus link accession number 3977), the gene encoding linker for activation 
of T cells ("LAT"; see, e.g., OMIM accession number 602354, locus link accession number 
27040), the gene encoding livin (see, e.g., OMIM accession number 605737, locus link 
accession number 79444, GenBank accession number NM_139317), the gene encoding 
luteinizing hormone, the gene encoding luteinizing hormone releasing hormone, the gene 
encoding macrophage migration inhibitory factor ("MIF"; see, e.g., OMIM accession 
number 153620, locus link accession number 4282, GenBank accession number 
NM_002415), the gene encoding major histocompatibility complex class I chain-related gene A 
("MICA"; see, e.g., OMIM accession number 600169, locus link accession number 4276, 
GenBank accession number NM_000247), the gene encoding major histocompatibility 
complex class I chain-related gene B ("MICB"; see, e.g., OMIM accession number 602436, 
locus link accession number 4277, GenBank accession number NM_005931), the gene 
encoding matrix metalloproteinase 9 ("MMP9"; see, e.g., OMIM accession number 120361, 
locus link accession number 4318), the gene encoding matrix metalloproteinase 12 
("MMP12"; see, e.g., OMIM accession number 601046, locus link accession number 4321), 
the gene encoding max interacting protein 1 ("MXI1"; see, e.g., OMIM accession number 
600020, locus link accession number 4601), the gene encoding MCC, the gene encoding 
MDM2 (see, e.g., OMIM accession number 164785, locus link accession number 4193, 
GenBank accession number NM_002392), the gene encoding METH-1, the gene encoding 
METH-2, the gene encoding methyl-CpG-binding endonuclease ("MBD4"; see, e.g., 
OMIM accession number 603574, locus link accession number 8930, GenBank accession 
number NM_003925), the gene encoding monoamine oxidase-A ("MAOA"; see, e.g., 
OMIM accession number 309850, locus link accession number 4128, GenBank accession 
number NM_000240), the gene encoding monoamine oxidase-B ("MAOB"; see, e.g., 
OMIM accession number 309860, locus link accession number 4129), the gene encoding 
monocyte chemotactic protein 1 ("MCP1"; see, e.g., OMIM accession number 158105, 
locus link accession number 6347), the gene encoding mos, the gene encoding MTS1, the 
gene encoding myc, the gene encoding myotrophin, the gene encoding N-acetyltransferase, 
the gene encoding N-cadherin, the gene encoding N-methyl D-aspartate ("NMDA") 
receptor, the gene encoding NAD(P)-dependent steroid dehydrogenase ("NSDHL"; see, 
e.g., OMIM accession number 300275, locus link accession number 50814), the gene 
encoding natural resistance-associated macrophage protein ("NRAMP"; see, e.g., OMIM 
accession number 600266, locus link accession number 6556), the gene encoding neural 
cell adhesion molecule 1 ("NCAM1"; see, e.g., OMIM accession number 116930, locus 
link accession number 4684), the gene encoding neuron growth associated protein 43 
("GAP-43"; see, e.g., OMIM accession number 162060, locus link accession number 2596), 
the gene encoding NF1, the gene encoding NF2, the gene encoding NGF, the gene encoding 
a NGFR subunit, the gene encoding nm23, the gene encoding nuclear factor of kappa light 
polypeptide gene enhancer in B-cells 1 ("NFKB1"; see, e.g., OMIM accession number 
164011, locus link accession number 4790), the gene encoding OSM, the gene encoding 
osteopontin ("OPN"; see, e.g., OMIM accession number 166490, locus link accession 
number 6696), the gene encoding P-glycoprotein-1 ("PGY1"; see, e.g., OMIM accession 
number 171050, locus link accession number 5243, GenBank accession number 
NM_000927), the gene encoding p38 MAP kinase ("p38" or "MAPK14"; see, e.g., OMIM accession 
number 600289, locus link accession number 1432), the gene encoding p53, the gene 
encoding p300/CBP associated factor ("PCAF"; see, e.g., OMIM accession number 602303, 
locus link accession number 8850), the gene encoding parathyroid hormone, the gene 
encoding PDGF, the gene encoding PDGF, beta chain ("PDGF2"; see, e.g., OMIM 
accession number 190040, locus link accession number 5155), the gene encoding a PDGFR 
subunit, the gene encoding peroxin-1 ("PEX1"; see, e.g., OMIM accession number 602136, 
locus link accession number 5189), the gene encoding peroxisome assembly factor-2 
("PEX6"; see, e.g., OMIM accession number 601498, locus link accession number 5190), 
the gene encoding peroxisome proliferator-activated receptor-gamma ("PPARg"; see, e.g., 
OMIM accession number 601487, locus link accession number 5468), the gene encoding 
phenylalanine hydroxylase, the gene encoding phosphodiesterase, the gene encoding human 
phosphotyrosyl-protein phosphatase ("PTP-1B"; see, e.g., OMIM accession number 
176885, locus link accession number 5770, GenBank accession number NM_002827), the 
gene encoding placental growth factor ("PGF"; see, e.g., OMIM accession number 601121, 
locus link accession number 5228, GenBank accession number NM_002632), the gene 
encoding plasminogen activator inhibitor protein ("PAI1"; see, e.g., OMIM accession 
number 173360, locus link accession number 5054), the gene encoding pleiotrophin 
("PTN"; see, e.g., OMIM accession number 162095, locus link accession number 5764), the 
gene encoding poly(rC) binding protein 2 ("PCBP2"; see, e.g., OMIM accession number 
601210, locus link accession number 5094), the gene encoding progranulin ("PCDGF" or 
"GRN"; see, e.g., OMIM accession number 138945, locus link accession number 2896), the 
gene encoding prolactin ("PRL"; see, e.g., OMIM accession number 176760, locus link 
accession number 5617, GenBank accession number NM_000948), the gene encoding 
proliferating cell nuclear antigen ("PCNA"; see, e.g., OMIM accession number 176740, 
locus link accession number 5111), the gene encoding protein kinase B/Akt ("AKT1"; see, 
e.g., OMIM accession number 164730, locus link accession number 207), the gene 
encoding protein kinase C, gamma ("PKCg"; see, e.g., OMIM accession number 176980, 
locus link accession number 5582), the gene encoding protein-tyrosine phosphatase, 4A, 3 
("PTP4A3"; see, e.g., OMIM accession number 606449, locus link accession number 
11156, GenBank accession number NM_032611), the gene encoding psoriasin ("PSOR1"; 
see, e.g., OMIM accession number 600353, locus link accession number 6278, GenBank 
accession number NM_002963), the gene encoding ras, the gene encoding resistin ("Fizz3"; 
see, e.g., OMIM accession number 605565, locus link accession number 56729, GenBank 
accession number NM_020415), the gene encoding retinoblastoma ("Rb"; see, e.g., OMIM 
accession number 180200, locus link accession number 5925, GenBank accession number 
NM_000321), the gene encoding retinoblastoma 1 ("Rb1"), the gene encoding 
retinoblastoma-binding protein 1-like 1 ("RBBP1L1"; see, e.g., locus link accession number 
51742), the gene encoding 5-α reductase, the gene encoding ribonuclease/angiogenin 
inhibitor ("RNH"; see, e.g., OMIM accession number 173320, locus link accession number 
6050), the gene encoding S100 calcium-binding protein A8 ("MRP8"; see, e.g., OMIM 
accession number 123885, locus link accession number 6279, GenBank accession number 
NM_002964), the gene encoding signal transducer and activator of transcription 6 
("STAT6"; see, e.g., OMIM accession number 601512, locus link accession number 6778), 
the gene encoding soluble-type polypeptide FZD4S ("FZD4S"; see, e.g., OMIM accession 
number 604579, locus link accession number 8322), the gene encoding somatotrophin or 
somatotropin, the gene encoding src (see, e.g., OMIM accession number 190090, locus link 
accession number 6714, GenBank accession number NM_005417), the gene encoding 
survivin, the gene encoding T-cell lymphoma invasion and metastasis 1 ("TIAM1"; see, 
e.g., OMIM accession number 600687, locus link accession number 7074), the gene 
encoding TEK tyrosine kinase ("TIE2"; see, e.g., OMIM accession number 600221, locus 
link accession number 7010), the gene encoding telomerase, the gene encoding TGF-β, the 
gene encoding TGF-β1 (see, e.g., OMIM accession number 190180, locus link accession 
number 7040), the gene encoding thrombomodulin ("THBD" or "THRM"; see, e.g., OMIM 
accession number 188040, locus link accession number 7056), the gene encoding 
thrombopoietin ("THPO" or "TPO"; see, e.g., OMIM accession number 600044, locus link 
accession number 7066), the gene encoding human triosephosphate isomerase ("TPI1"; 
see, e.g., OMIM accession number 109450, locus link accession number 7167), the gene 
encoding thyroid hormone, the gene encoding thyroid stimulating hormone, the gene 
encoding tissue factor, the gene encoding tissue inhibitor of metalloprotease 1 ("TIMP1"; 
see, e.g., OMIM accession number 305370, locus link accession number 7076), the gene 
encoding tissue inhibitor of metalloprotease 2 ("TIMP2"; see, e.g., OMIM accession 
number 188825, locus link accession number 7077, GenBank accession number 
NM_003255), the gene encoding tissue inhibitor of metalloprotease 4 ("TIMP4"; see, e.g., OMIM 
accession number 601915, locus link accession number 7079, GenBank accession number 
NM_003256), the gene encoding TNF-α (see, e.g., OMIM accession number 191160, locus 
link accession number 7124), the gene encoding troponin T ("TnT"), the gene encoding 
uncoupling protein 2 ("UCP2"; see, e.g., OMIM accession number 601693, locus link 
accession number 7351, GenBank accession number NM_003355), the gene encoding 
urokinase plasminogen activator ("uPA"; see, e.g., OMIM accession number 191840, locus 
link accession number 5328), the gene encoding utrophin ("UTRN"; see, e.g., OMIM 
accession number 128240, locus link accession number 7402), the gene encoding v-myc 
myelocytomatosis viral oncogene homolog ("c-MYC"; see, e.g., OMIM accession number 
190080, locus link accession number 4609), the gene encoding vanilloid receptor subunit 1 
("VR1"; see, e.g., OMIM accession number 602076, locus link accession number 7442, 
GenBank accession number NM_018727, NM_080704, NM_080705, NM_080706), the 
gene encoding vascular endothelial growth factor ("VEGF"), the gene encoding virion 
infectivity factor ("VIF"), and the gene encoding VLA-4. 

[00102] In a specific embodiment, an untranslated region is obtained or derived from the 
gene encoding Her-2. In another embodiment, an untranslated region is not obtained or 
derived from the gene encoding Her-2. 

[00103] In one embodiment, an untranslated region is obtained or derived from the gene 
encoding VEGF. In another embodiment, an untranslated region is not obtained or derived 
from the gene encoding VEGF. 

[00104] The untranslated regions may be obtained or derived from the genome of any 
virus utilizing any method well-known to one of skill in the art. The nucleotide sequence of 
an untranslated region for a genome of a virus can be obtained, e.g., from the literature or a 
database such as GenBank. Examples of viruses from which the untranslated regions may 
be obtained or derived include, but are not limited to, retroviruses (e.g., human 
immunodeficiency virus ("HIV") and human T cell leukemia virus ("HTLV")), 
herpesviruses (e.g., herpes simplex virus, Epstein-Barr virus and varicella zoster virus), 
reoviruses (e.g., reovirus and rotavirus), picornaviruses (e.g., poliovirus, rhinovirus and 
hepatitis A virus), togaviruses (e.g., rubella virus), orthomyxoviruses (e.g., influenza virus), 
paramyxoviruses (e.g., measles virus, mumps virus, respiratory syncytial virus and 
parainfluenza virus), filoviruses (e.g., Ebola virus and Marburg virus), rhabdoviruses (e.g., 
rabies virus), coronaviruses (e.g., coronavirus), rhinoviruses, hepatitis B virus, and hepatitis 
C virus. 
[00105] The untranslated regions may be obtained or derived from the genome of any 
bacterium utilizing any method well-known to one of skill in the art. The nucleotide sequence 
of an untranslated region for a genome of a bacterium can be obtained, e.g., from the literature 
or a database such as GenBank. Examples of bacteria from which the untranslated regions 
may be obtained or derived include, but are not limited to, the Aquaspirillum family, 
Azospirillum family, Azotobacteraceae family, Bacteroidaceae family, Bartonella species, 
Bdellovibrio family, Campylobacter species, Chlamydia species (e.g., Chlamydia 
pneumoniae), Clostridium, Enterobacteriaceae family (e.g., Citrobacter species, 
Edwardsiella, Enterobacter aerogenes, Erwinia species, Escherichia coli, Hafnia species, 
Klebsiella species, Morganella species, Proteus vulgaris, Providencia, Salmonella species, 
Serratia marcescens, and Shigella flexneri), Gardinella family, Haemophilus influenzae, 
Halobacteriaceae family, Helicobacter family, Legionallaceae family, Listeria species, 
Methylococcaceae family, mycobacteria (e.g., Mycobacterium tuberculosis), Neisseriaceae 
family, Oceanospirillum family, Pasteurellaceae family, Pneumococcus species, 
Pseudomonas species, Rhizobiaceae family, Spirillum family, Spirosomaceae family, 
Staphylococcus (e.g., methicillin resistant Staphylococcus aureus and Staphylococcus 
pyrogenes), Streptococcus (e.g., Streptococcus enteritidis, Streptococcus fasciae, and 
Streptococcus pneumoniae), VampirovibrHelicobacter family, and Vampirovibrio family. 
[00106] The untranslated regions may be obtained or derived from the genome of any 
fungus utilizing any method well-known to one of skill in the art. The nucleotide sequence 
of an untranslated region for a genome of a fungus can be obtained, e.g., from the literature 
or a database such as GenBank. Examples of fungi from which the untranslated regions 
may be obtained or derived include, but are not limited to, Absidia species (e.g., 
Absidia corymbifera and Absidia ramosa), Aspergillus species (e.g., Aspergillus flavus, 
Aspergillus fumigatus, Aspergillus nidulans, Aspergillus niger, and Aspergillus terreus), 
Basidiobolus ranarum, Blastomyces dermatitidis, Candida species (e.g., Candida albicans, 
Candida glabrata, Candida kerr, Candida krusei, Candida parapsilosis, Candida 
pseudotropicalis, Candida guilliermondii, Candida rugosa, Candida stellatoidea, and 
Candida tropicalis), Coccidioides immitis, Conidiobolus species, Cryptococcus neoformans, 
Cunninghamella species, dermatophytes, Histoplasma capsulatum, Microsporum gypseum, 
Mucor pusillus, Paracoccidioides brasiliensis, Pseudallescheria boydii, Rhinosporidium 
seeberi, Pneumocystis carinii, Rhizopus species (e.g., Rhizopus arrhizus, Rhizopus oryzae, 
and Rhizopus microsporus), Saccharomyces species, Sporothrix schenckii, zygomycetes, 
and classes such as Zygomycetes, Ascomycetes, Basidiomycetes, Deuteromycetes, and 
Oomycetes. 
[00107] The untranslated regions may be obtained or derived from the genome of any 
plant utilizing any method well-known to one of skill in the art. The nucleotide sequence of 
an untranslated region for a genome of a plant can be obtained, e.g., from the literature or a 
database such as GenBank, EMBL, DDBJ, rice genome database, cotton genome database 
and maize genome database. Examples of plants from which the untranslated regions may 
be obtained or derived include, but are not limited to, soybean, canola, cotton, corn, 
wheat, rice, tomato, and potato. Specific examples of plant genes from which an 
untranslated region may be obtained or derived include, but are not limited to, triose 
phosphate isomerase, fructose 1,6-bisphosphate aldolase, fructose 1,6-bisphosphatase, 
fructose 6-phosphate 2-kinase, phosphoglucoisomerase, pyrophosphate-dependent 
fructose-6-phosphate phosphotransferase, vacuolar translocating-pyrophosphatase, invertase, 
sucrose synthase, hexokinase, fructokinase, NDP-kinase, glucose-6-phosphate 
1-dehydrogenase, phosphoglucomutase, UDP-glucose pyrophosphorylase, glutenin genes, 
cis-prenyltransferase, lipoxygenase, and soybean vestitone reductase (see, e.g., U.S. Patent 
Application Publication No. 2003/0135870 A1 and U.S. Patent Nos. 6,638,5252, 6,645,747, 
6,627,797, and 6,617,493, which are incorporated herein by reference in their entireties). 
[00108] In particular, a 5' UTR of a target gene, a 3' UTR of a target gene, or a 5' UTR 
and a 3' UTR of a target gene may be utilized in a reporter construct. In a specific 
embodiment, a 5' UTR of a target gene with a stable hairpin secondary structure is utilized 
in a reporter construct. In another specific embodiment, a reporter gene in the reporter 
construct contains an intron. In a preferred embodiment, a 5' UTR and a 3' UTR of a target 
gene are utilized in a reporter construct. In another preferred embodiment, a 5' UTR and a 
3' UTR of a target gene and an intron-containing reporter gene are utilized in a reporter 
construct. 
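
By way of illustration only, the alternative UTR arrangements described in this paragraph can be sketched in Python as follows. The class name, function name, and placeholder sequences are hypothetical and are not taken from the specification; the sketch only concatenates sequence strings and does not model promoters, introns, or vector backbones.

```python
from dataclasses import dataclass
from typing import Dict, Optional


@dataclass(frozen=True)
class ReporterConstruct:
    """Hypothetical model of a reporter transcript flanked by target-gene UTRs."""
    reporter_orf: str
    utr5: Optional[str] = None  # 5' UTR of the target gene, if used
    utr3: Optional[str] = None  # 3' UTR of the target gene, if used

    def transcript(self) -> str:
        # Order on the mRNA: 5' UTR, reporter coding sequence, 3' UTR.
        return (self.utr5 or "") + self.reporter_orf + (self.utr3 or "")


def build_variants(utr5: str, utr3: str, reporter_orf: str) -> Dict[str, ReporterConstruct]:
    """Return the three arrangements named in the text: 5' only, 3' only, both."""
    return {
        "5' UTR only": ReporterConstruct(reporter_orf, utr5=utr5),
        "3' UTR only": ReporterConstruct(reporter_orf, utr3=utr3),
        "5' UTR and 3' UTR": ReporterConstruct(reporter_orf, utr5=utr5, utr3=utr3),
    }


# Placeholder sequences, chosen only so the example runs.
variants = build_variants(utr5="GGCC", utr3="AATAAA", reporter_orf="ATGTAA")
for name, construct in variants.items():
    print(name, construct.transcript())
```

Keeping the two UTRs as optional fields makes it straightforward to compare the same reporter with one, the other, or both untranslated regions attached.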

5.1.1. Elements of Untranslated Regions 
[00109] Any element of an untranslated region(s) of a target gene may be utilized in the 
reporter gene constructs described herein. Elements of an untranslated region(s) may be 
obtained and the nucleotide sequence of the elements determined by any method 
well-known to one of skill in the art. The nucleotide sequence of an element of an untranslated 
region for a target gene can be obtained, e.g., from the literature or a database such as 
GenBank. Alternatively, the nucleotide sequence of an element of an untranslated region of 
a target gene may be generated from nucleic acid from a suitable source. If a clone 
containing a nucleic acid of an element of an untranslated region of a target gene is not 
available, but the sequence of the element is known, a nucleic acid of the element may be 
chemically synthesized or obtained from a suitable source (e.g., a cDNA library) by PCR 
amplification. Once the nucleotide sequence of an element is determined, the nucleotide 
sequence of the element may be manipulated using methods well-known in the art for the 
manipulation of nucleotide sequences, e.g., recombinant DNA techniques, site directed 
mutagenesis, PCR, etc. (see, for example, the techniques described in Sambrook et al., 
1990, Molecular Cloning, A Laboratory Manual, 2d Ed., Cold Spring Harbor Laboratory, 
Cold Spring Harbor, NY and Ausubel et al., eds., 1998, Current Protocols in Molecular 
Biology, John Wiley & Sons, NY, which are both incorporated by reference herein in their 
entireties), to generate an element having a different nucleic acid sequence. 
[00110] In one embodiment, an element(s) of an untranslated region comprises the 
full-length sequence of a UTR, e.g., the 5' UTR or the 3' UTR. In a specific embodiment, an 
element(s) of an untranslated region that has been shown or has been suggested to be 
involved in the regulation of mRNA stability and/or translation is utilized in the reporter 
constructs described herein. Examples of elements of an untranslated region which may be 
utilized in the reporter constructs described herein include, but are not limited to, an IRE, 
IRES, uORF, MSL-2, G quartet element, 5'-terminal oligopyrimidine tract ("TOP"), ARE, 
SECIS, histone stem loop, CPE, nanos translational control element, APP, TGE/DRE, BRE, 
and a 15-LOX-DICE. 

5.1.1.1. Iron Response Element 

[00111] The maintenance of cellular iron homeostasis occurs at the level of mRNA 
stability and translation. Two components of this regulatory system have been defined: a 
cis-acting mRNA sequence/structure motif called an iron-responsive element ("IRE") and a 
specific trans-acting cytoplasmic binding protein, referred to herein as IRE-binding protein 
("IRE-BP") (reviewed in, e.g., Mikulits et al., 1999, Mutat Res. 437(3):219-30; Harrison & 
Arosio, 1996, Biochim Biophys Acta. 1275(3):161-203; Kuhn & Hentze, 1992, J Inorg 
Biochem. 47(3-4):183-95; and Harford & Klausner, 1990, Enzyme 44(1-4):28-41, the 
disclosures of which are hereby incorporated by reference in their entireties). Iron scarcity 
induces binding of IRE-BPs to a single IRE in the 5' UTR of ferritin, eALAS, aconitase, 
erythroid 5-aminolevulinic acid synthase, and SDHb mRNAs, which specifically suppresses 
translation initiation. Simultaneous interaction of IRE-BPs with multiple IREs in the 3' 
UTR of transferrin receptor mRNA selectively causes its stabilization. The pattern is 
reversed under iron overload: IRE-BP mRNA binding affinity is reduced, which results in 
efficient protein synthesis of target transcripts harboring IREs in the 5' UTR and rapid 
degradation of transferrin receptor mRNA. Any gene containing an IRE including, but not limited 
to, the IREs described in the references cited above, can be used in the present invention to 
identify compounds that modulate untranslated region-dependent gene expression. 
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5.1.1.2. Internal Ribosome Entry Site 
[00112] The internal ribosome entry site ("IRES") is one of the better characterized 5'
UTR-based cis-acting elements of post-transcriptional gene expression control. IRESes
facilitate cap-independent translation initiation by recruiting ribosomes directly to the 5'
UTR of the mRNA. IRESes are commonly located in the 3' region of the 5' UTR and are, as
recent work has established, frequently composed of several discrete sequences. IRESes do
not share significant primary structure homology, but do form distinct RNA tertiary
structures. Some IRESes contain sequences complementary to 18S RNA and therefore may
form stable complexes with the 40S ribosomal subunit and initiate assembly of a
translationally competent complex. A classic example of an "RNA-only" IRES is the internal
ribosome entry site from Hepatitis C virus. However, most known IRESes require protein
co-factors for activity. More than 10 IRES trans-acting factors ("ITAFs") have been
identified so far. In addition, all canonical translation initiation factors, with the sole
exception of the 5' end cap-binding eIF4E, have been shown to participate in IRES-mediated
translation initiation
(reviewed in Vagner et al., 2001, EMBO Reports 2:893 and Translational Control of Gene
Expression, Sonenberg, Hershey, and Mathews, eds., 2000, CSHL Press, the disclosures of
which are incorporated by reference in their entireties). 

[00113] IRESes were first identified in picornaviruses (see, e.g., Pelletier & Sonenberg,
1988, Nature, 334:320-325). The 5' UTRs of all picornaviruses are long and mediate
translational initiation by directly recruiting and binding ribosomes, thereby circumventing
the initial cap-binding step. Although IRES elements are frequently found in viral mRNAs,
they are rarely found in non-viral mRNAs. The non-viral mRNAs shown to contain
functional IRES elements in their respective 5' UTRs include those encoding
immunoglobulin heavy chain binding protein ("BiP") (see, e.g., Macejak et al., 1991,
Nature, 353:90-4); Drosophila Antennapedia (see, e.g., Oh et al., 1992, Genes Dev 6:1643-
53) and Ultrabithorax (see, e.g., Ye et al., 1997, Mol. Cell Biol. 17:1714-21); fibroblast
growth factor 2 (see, e.g., Vagner et al., 1995, Mol. Cell Biol. 15:35-44); initiation factor
eIF4G (see, e.g., Gan et al., 1998, J. Biol. Chem. 273:5006-12); proto-oncogene c-myc (see,
e.g., Nanbru et al., 1995, J. Biol. Chem. 272:32061-6 and Stoneley, 1998, Oncogene
16:423-8); vascular endothelial growth factor ("VEGF") (see, e.g., Stein et al., 1998, Mol.
Cell Biol. 18:3112-9), and X-linked inhibitor of apoptosis protein ("XIAP") (see, e.g., U.S.
Patent Nos. 6,159,709 and 6,171,821), the disclosures of which are incorporated by
reference in their entireties. Any gene containing an IRES including, but not limited to, the
IRESes described in the references cited above, can be used in the present invention to
identify compounds that modulate untranslated region-dependent gene expression.

5.1.1.3. Male Specific Lethal Element

[00114] Male-specific expression of the protein male-specific-lethal 2 ("MSL-2") 
controls dosage compensation in Drosophila. MSL-2 protein is not produced in females 
and sequences in both the 5' and 3' UTRs are important for this sex-specific regulation
because msl-2 gene expression is inhibited in females by Sex-lethal ("SXL"), an RNA
binding protein known to regulate pre-mRNA splicing. An intron present in the 5'
untranslated region of msl-2 mRNA contains putative SXL binding sites and is retained in
female flies. The msl-2 pre-mRNA is alternatively spliced in a Sex-lethal-dependent
fashion (see, e.g., Gebauer et al., 1998, RNA 4(2):142-50 and Bashaw & Baker, 1995,
Development 121(10):3245-58, the disclosures of which are hereby incorporated by
reference in their entireties). Any gene containing an MSL-2 element including, but not
limited to, the MSL-2 elements described in the references cited above, can be used in the
present invention to identify compounds that modulate untranslated region-dependent gene
expression. 

5.1.1.4. G-quartet Element 

[00115] A symmetrical structure of two tetrads of guanosine base pairs connected by 
three loops is commonly referred to as a "G-quartet", "G-quadruplex" or "G-tetraplex" 
structure (see, e.g., Wang et al., 1993, Biochemistry 32:1899-1904; Macaya et al., 1993, 
Proc. Natl. Acad. Sci. 90:3745-3749; Schultze et al., 1994, J. Mol. Biol. 235:1532-1547; 
and Kelly et al., 1996, J. Mol. Biol. 256:417-422, the disclosures of which are incorporated 
by reference in their entireties). A G-quartet element was first identified as a conserved 
consensus sequence GGNTGGN(2-5)GGNTGG (SEQ ID NO: 1), which was present in
single-stranded DNA aptamers that bind thrombin and inhibit thrombin-catalyzed fibrin-
clot formation (see, e.g., Bock et al., 1992, Nature 355:564-566, the disclosure of which is
incorporated by reference in its entirety). A similar sequence in which the G-quartet
structure is maintained when the length of the oligonucleotide between the G pairs is
increased has been identified (see, e.g., Dias et al., 1994, J. Am. Chem. Soc. 116:4479-
4480, the disclosure of which is incorporated by reference in its entirety).
[0100] A G-quartet element has been identified in mRNAs associated with fragile X
mental retardation syndrome (reviewed in, e.g., Bardoni & Mandel, 2002, Curr Opin Genet
Dev 12(3):284-93, the disclosure of which is incorporated by reference in its entirety). The
fragile X mental retardation syndrome is caused by large methylated expansions of a CGG
repeat in the FMR1 gene that lead to the loss of expression of FMRP, an RNA-binding
protein. FMRP is proposed to act as a regulator of mRNA transport or translation that plays
a role in synaptic maturation and function and has been shown to interact preferentially with
mRNAs containing a G-quartet structure.

[0101] G-quartet oligonucleotides can have the sequence GGNxGGNyGGNzGG (SEQ 
ID NO: 2), wherein x, y and z indicate a variable number of nucleotides (see, e.g., U.S. 
Patent No. 5,691,145, the disclosure of which is incorporated by reference in its entirety). 
While x, y and z are each typically at least about 2, preferably about 2-10, these segments
may be longer if desired. The regions of variable sequence (i.e., NxNyNz) are not critical in
the present invention and can be varied in length and sequence without disrupting the
characteristic G-quartet structure. As a general rule, the variable N sequences should not be
self-complementary and should not contain G residues which would result in alternative G-
quartet structures within the molecule. Representative G-quartet oligonucleotides are 15-20
nucleotides in length, but G-quartet oligonucleotides of any length which conform to the
general formula GGNxGGNyGGNzGG (SEQ ID NO: 3) are also suitable. The G-quartet
oligonucleotide is typically about 14-30 nucleotides in length. Any gene containing a G-
quartet element including, but not limited to, the G-quartet elements described in the
references cited above, can be used in the present invention to identify compounds that 
modulate untranslated region-dependent gene expression. 
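
By way of illustration only, the general formula GGNxGGNyGGNzGG can be expressed as a pattern and used to scan a UTR sequence for candidate G-quartet elements. The following is a minimal sketch and not part of the disclosed methods. The function name and the default spacer range of 2 to 10 nucleotides (taken from the "preferably about 2-10" range above) are assumptions made for the example. The sketch does not test the spacers for self-complementarity or for G residues that could give alternative G-quartet structures.

```python
import re


def find_g_quartet_candidates(seq, min_n=2, max_n=10):
    """Return (start, match) pairs for GGNxGGNyGGNzGG candidates.

    Hypothetical helper for illustration. RNA input is converted to the
    DNA alphabet so that one pattern covers both.
    """
    dna = seq.upper().replace("U", "T")
    spacer = "[ACGT]{%d,%d}" % (min_n, max_n)
    # A lookahead lets candidates that start at different positions overlap.
    pattern = re.compile("(?=(GG%sGG%sGG%sGG))" % (spacer, spacer, spacer))
    return [(m.start(), m.group(1)) for m in pattern.finditer(dna)]
```

For example, the 15-nucleotide sequence GGTTGGTGTGGTTGG fits the formula with x = 2, y = 3 and z = 2, so scanning it returns a single candidate that starts at position 0.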

5.1.1.5. 5'-terminal Oligopyrimidine Tract
[0102] Translation control can be mediated by a terminal oligopyrimidine element
("TOP") present in the 5' untranslated region of ribosomal protein-encoding mRNAs. TOP
elements adopt a specific secondary structure that prevents ribosome-binding and
translation-initiation of ribosomal protein-encoding mRNAs. However, binding of cellular
nucleic acid binding protein ("CNBP") or La proteins to the TOP hairpin structure abolishes
the TOP-mediated translation block and induces ribosomal protein production (see, e.g.,
Schlatter & Fussenegger, 2003, Biotechnol Bioeng 81(1):1-12; Zhu et al., 2001, Biochim
Biophys Acta 1521(1-3):19-29; and Crosio et al., 2000, Nucleic Acids Res. 28(15):2927-34,
the disclosures of which are incorporated by reference in their entireties).
[0103] The immunosuppressant rapamycin selectively suppresses the translation of
mRNAs containing a TOP tract adjacent to the cap structure. Trans-acting factors, some of
which are regulated by rapamycin-responsive signaling pathways, that bind to the 5'
untranslated region of TOP mRNAs may be involved in selective translational repression
(see, e.g., Kakegawa et al., 2002, Arch Biochem Biophys 402(1):77-83, the disclosure of
which is incorporated by reference in its entirety). Any gene containing a TOP element
including, but not limited to, the TOP elements described in the references cited above, can
be used in the present invention to identify compounds that modulate untranslated region-
dependent gene expression.



5.1.1.6. Adenylate Uridylate-rich Element 
[0104] AU-rich elements ("AREs") are the most extensively studied 3' UTR-based
regulatory signals. AREs are the primary determinant of mRNA stability and one of the key
determinants of mRNA translation initiation efficiency. 

[0105] A typical ARE is 50 to 150 nt long and contains 3 to 6 copies of the AU3A pentamer
embedded in a generally A/U-enriched RNA region. The AU3A pentamers can be scattered
within the region or can stagger or even overlap (Chen et al., 1995, Trends Biochem. Sci.
20:465, the disclosure of which is incorporated by reference in its entirety). One or several
AU3A pentamers can be replaced by expanded versions such as an AU4A hexamer or AU5A
heptamer (see, e.g., Wiklund et al., 2002, J. Biol. Chem. 277:40462 and Tholanikunnel &
Malbon, 1997, J. Biol. Chem. 272:11471, the disclosures of which are incorporated by
reference in their entireties). Single copies of the AUnA (where n = 3, 4, or 5) elements
placed in a random sequence context are inactive. The minimal active ARE has been
determined to have the sequence U2AUnA(U/A)(U/A) (where n = 3, 4, or 5) (see, e.g.,
Worthington et al., 2002, J Biol Chem, 277:48558-64, the disclosure of which is
incorporated by reference in its entirety). The activity of certain AU-rich elements in
promoting mRNA degradation is enhanced in the presence of distal uridine-rich sequences.
These U-rich elements do not affect mRNA stability when present alone and thus have
been termed "ARE enhancers" (see, e.g., Chen et al., 1994, Mol. Cell. Biol. 14:416, the
disclosure of which is incorporated by reference in its entirety). 
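
As an illustration only, the sequence features described above can be written as simple patterns for scanning a 3' UTR. The sketch below counts AU3A pentamers (including overlapping copies) and locates matches to the minimal active ARE, U2AUnA(U/A)(U/A) with n = 3, 4, or 5. The function names are hypothetical and chosen for the example. A match indicates only that the sequence fits the motif, not that the element is functional.

```python
import re

# AU3A pentamer; the lookahead allows overlapping copies to be counted.
PENTAMER = re.compile(r"(?=AUUUA)")
# Minimal active ARE: U2 A Un A (U/A)(U/A), with n = 3, 4, or 5.
MINIMAL_ARE = re.compile(r"(?=(UUAU{3,5}A[UA][UA]))")


def to_rna(seq):
    """Upper-case the sequence and convert a DNA alphabet to RNA."""
    return seq.upper().replace("T", "U")


def count_au3a_pentamers(seq):
    """Count AUUUA pentamers, including overlapping ones."""
    return sum(1 for _ in PENTAMER.finditer(to_rna(seq)))


def find_minimal_ares(seq):
    """Return (start, match) pairs for minimal active ARE candidates."""
    return [(m.start(), m.group(1)) for m in MINIMAL_ARE.finditer(to_rna(seq))]
```

For instance, the nonamer AUUUAUUUA contains two overlapping AU3A pentamers, which is the overlapping arrangement mentioned above.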
[0106] Most AREs function in mRNA decay regulation and translation initiation 
regulation by interacting with specific ARE-binding proteins ("AUBPs"). There are at least
14 known cellular proteins that bind to AU-rich elements. AUBP functional properties 
determine ARE involvement in one or both pathways. For example, ELAV/HuR binding to 
c-fos ARE inhibits c-fos mRNA decay (see, e.g., Brennan & Steitz, 2001, Cell Mol Life 
Sci. 58:266), association of tristetraprolin with TNFa ARE dramatically enhances TNFa 
mRNA hydrolysis (see, e.g., Carballo et al., 1998, Science 281:1001), whereas interaction 
of TIA-1 with the TNFa ARE does not alter the TNFa mRNA stability but inhibits TNFa 
translation (see, e.g., Piecyk et al., 2000, EMBO J. 19:4154). 
[0107] Since AREs are clearly important in biological systems, including but not 
limited to a number of the early response genes that regulate cell proliferation and responses 
to exogenous agents, the identification of compounds that bind to one or more of the ARE
clusters and potentially modulate the stability and translation of the target RNA can
potentially be of value as a therapeutic. Any gene containing an ARE including, but not
limited to, the AREs described in the references cited above, can be used in the present
invention to identify compounds that modulate untranslated region-dependent gene 
expression. 

5.1.1.7. Selenocysteine Insertion Sequence

[0108] Selenium is an essential micronutrient that is now known to be incorporated as
selenocysteine in a number of selenoproteins, glutathione peroxidase being the prototypical 
example. Selenocysteine is specifically encoded by the UGA codon, and inserted in peptide 
chains by a cotranslational mechanism that is able to override the normal function of UGA 
as a termination codon. In eukaryotes, efficient selenocysteine incorporation at UGA 
codons requires a cellular protein factor and a cis-acting structural signal usually located in
the mRNA 3' untranslated region, consisting of a selenocysteine insertion sequence 
("SECIS") in a characteristic stem-loop structure (see, e.g., Peterlin et al., 1993, "Tat Trans- 
Activator" In Human Retroviruses; Cullen, Ed.; Oxford University Press: New York; pp. 
75-100; Le & Maizel, 1989, J. Theor. Biol. 138:495-510; and reviewed in Hubert et al., 
1996, Biochimie 78(7):590-6, the disclosures of which are incorporated by reference in 
their entireties). The required protein factor is presumed to be present in certain cell types
that express selenoproteins, such as liver cells, lymphocytes, macrophages, thrombocytes,
and other blood cells. In such cell types, the presence of a SECIS element in an mRNA is
necessary and sufficient for in-frame UGA codons to be translated as selenocysteine.
[0109] A SECIS element is usually UAAAG, although other SECIS elements have been
identified or variants have been constructed (see, e.g., U.S. Patent Nos. 6,303,295, 
5,849,520, and 5,700,660, the disclosures of which are incorporated by reference in their 
entireties). Any gene containing a SECIS element including, but not limited to, the SECIS 
elements described in the references cited above, can be used in the present invention to 
identify compounds that modulate untranslated region-dependent gene expression. 
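
For illustration only, the in-frame UGA codons referred to above can be located by reading a coding sequence in triplets from its first nucleotide. The sketch below is a minimal example under stated assumptions: the function name is hypothetical, and the input is assumed to begin at the start codon so that position 0 defines the reading frame. It does not evaluate whether a SECIS element is present in the 3' untranslated region.

```python
def in_frame_uga_positions(cds):
    """Return 0-based nucleotide offsets of in-frame UGA codons.

    Hypothetical helper: `cds` is assumed to start at the first
    nucleotide of the start codon, which fixes the reading frame.
    """
    rna = cds.upper().replace("T", "U")
    # Step through the sequence one codon (three nucleotides) at a time.
    return [i for i in range(0, len(rna) - 2, 3) if rna[i:i + 3] == "UGA"]
```

A UGA triplet that is out of frame, such as the one spanning the first two codons of AUGAUGAAA, is not reported.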

5.1.1.8. Histone Stem Loop 

[0110] Replication-dependent histone mRNAs end with a conserved 26-nucleotide
sequence that contains a 16-nucleotide stem-loop, i.e., the histone stem loop, instead of a
poly(A) tail. Formation of the 3' end of histone mRNA occurs by endonucleolytic cleavage
of pre-mRNA releasing the mature mRNA from the chromatin template. Cleavage requires
several trans-acting factors, including a protein, the stem-loop binding protein, which binds
the 26-nucleotide sequence, and a small nuclear RNP, U7 snRNP (reviewed in, e.g.,
Dominski & Marzluff, 1999, Gene 239(1):1-14, the disclosure of which is incorporated by
reference in its entirety). 

[0111] Sequences of histone stem loops have been described in U.S. Patent Nos. 
6,476,208; 6,455,280; 6,399,373; 6,346,381; 6,335,170; 6,331,396; 6,265,546; 6,265,167; 
5,990,298; 5,908,779 and 5,843,770, the disclosures of which are incorporated by reference 
in their entireties. Any gene containing a histone stem loop including, but not limited to, 
the histone stem loops described in the references cited above, can be used in the present 
invention to identify compounds that modulate untranslated region-dependent gene 
expression. 

5.1.1.9. Cytoplasmic Polyadenylation Element

[0112] Maturation-specific polyadenylation in Xenopus oocytes depends on the 
presence of a U-rich cytoplasmic polyadenylation element ("CPE") close to the 3' end of 
the RNA. RNAs that lack CPEs appear to be deadenylated by default when meiosis 
resumes. This default program also applies to maturing mouse oocytes (see, e.g., Paynton 
& Bachvarova, 1994, Mol Reprod Dev 37(2):172-80, the disclosure of which is
incorporated by reference in its entirety). CPEs have been identified in Wee1 protein
tyrosine kinase mRNA (see, e.g., Charlesworth et al., 2000, Dev Biol 227(2):706-19, the
disclosure of which is incorporated by reference in its entirety), cyclin B1 mRNA (see, e.g.,
Tay et al., 2000, Dev Biol 221(1):1-9 and Barkoff et al., 2000, Dev Biol 220(1):97-109, the
disclosures of which are incorporated by reference in their entireties), and Xenopus Id3 
mRNA (see, e.g., Afouda et al., 1999, Mech Dev 88(1):15-31, the disclosure of which is 
incorporated by reference in its entirety). 

[0113] A Xenopus oocyte CPE binding protein ("CPEB") binds the CPE and stimulates
polyadenylation. CPEB is essential for the cytoplasmic polyadenylation of B4 RNA, G10,
c-mos, cdk2, cyclins A1, B1 and B2 mRNAs, which suggests that this protein is required for
polyadenylation of most RNAs during oocyte maturation (see, e.g., Stebbins-Boaz et al.,
1996, EMBO J 15(10):2582-92, the disclosure of which is incorporated by reference in its
entirety). Any gene containing a CPE including, but not limited to, the CPEs described in
the references cited above, can be used in the present invention to identify compounds that 
modulate untranslated region-dependent gene expression. 

5.1.1.10. Nanos Translational Control Element

[0114] The nanos translational control element is a discrete translational control 
element within the nanos 3' untranslated region that acts independently of the localization
signal to mediate translational repression of unlocalized nanos RNA (see, e.g., Clark et al.,
2002, Development 129(14):3325-34; Clark et al., 2000, Curr Biol 10(20):1311-4; Crucs et
al., 2000, Mol Cell 5(3):457-67; Bergsten & Gavis, 1999, Development 126(4):659-69; 
Dahanukar & Wharton, 1996, Genes Dev (20):2610-20; and Gavis et al., 1996, 
Development 122(9):2791-800, the disclosures of which are incorporated by reference in 
their entireties). 

[0115] During Drosophila embryogenesis, the Smaug protein represses translation of
the nanos protein through an interaction with the nanos translational control element (see,
e.g., Green et al., 2002, Biochem Biophys Res Commun 297(5):1085-8, the disclosure of
which is incorporated by reference in its entirety). Any gene containing a nanos 
translational control element including, but not limited to, the nanos translational control 
elements described in the references cited above, can be used in the present invention to 
identify compounds that modulate untranslated region-dependent gene expression. 

5.1.1.11. Amyloid Precursor Protein Element 

[0116] In one embodiment, the amyloid precursor protein element ("APP" element)
refers to a novel iron-responsive element within the 5' untranslated region of the
Alzheimer's amyloid precursor protein ("APP") transcript (+51 to +94 from the 5'-cap site)
(see, e.g., Rogers et al., 2002, J Biol Chem 277(47):45518-28). The APP mRNA IRE is
located immediately upstream of an interleukin-1 responsive acute box domain (+101 to
+146). Translation conferred by the APP 5' UTR was selectively down-regulated in response
to intracellular iron chelation. 

[0117] In another embodiment, the APP element refers to a 29 base instability element 
in the 3' UTR of the amyloid precursor protein involved in mRNA stability (see, e.g., 
Westmark & Malter, 2001, Brain Res Mol Brain Res 90(2):193-201; Rajagopalan & Malter,
2000, J Neurochem 74(1):52-9; Amara et al., 1999, Brain Res Mol Brain Res 71(1):42-9;
and Zaidi & Malter, 1995, J Biol Chem 270(29):17292-8, the disclosures of which are
incorporated by reference in their entireties). Any gene containing an APP element
including, but not limited to, the APP elements described in the references cited above, can 
be used in the present invention to identify compounds that modulate untranslated region- 
dependent gene expression. 

5.1.1.12. Translation Regulation Element 

[0118] Negative translational control elements in 3' UTRs regulate pattern formation,
cell fate, and sex determination in a variety of organisms. tra-2 mRNA in Caenorhabditis
elegans is required for female development but must be repressed to permit
spermatogenesis in hermaphrodites. Translational repression of tra-2 mRNA in C. elegans
is mediated by tandemly repeated elements in its 3' UTR; these elements are called TGEs
(for tra-2 and GLI element) (see, e.g., Thompson et al., 2000, Mol Cell Biol 20(6):2129-37;
Haag & Kimble, 2000, Genetics 155(1):105-16; and Jan et al., 1997, EMBO J 16(20):6301-
13, the disclosures of which are incorporated by reference in their entireties). Any gene
containing a TGE including, but not limited to, the TGEs described in the references cited
above, can be used in the present invention to identify compounds that modulate 
untranslated region-dependent gene expression. 

5.1.1.13. Direct Repeat Element 

[0119] The direct repeat element ("DRE") is one control element in the 3' UTR of the
tra-2 mRNA that causes repression of tra-2, i.e., inhibits translation of tra-2 mRNA, which
is responsible for the onset of hermaphrodite spermatogenesis in C. elegans (see, e.g.,
Goodwin et al., 1993, Cell 75:329-339, the disclosure of which is incorporated by reference
in its entirety). Three germline-specific regulators have been identified that mediate DRE
regulation by the tra-2 3' UTR. These include DRPQ2/GLD-1, a protein that specifically
binds the DRE (see, e.g., Goodwin et al., 1993, Cell 75:329-339) and controls tra-2
translation (see, e.g., Jan et al., 1999, EMBO J. 18:258-269); FOG-2, a protein that binds
GLD-1 and is required for the onset of hermaphrodite spermatogenesis (see, e.g., Schedl & 
Kimble, 1988, Genetics 119:43-61); and laf-1, a gene that has not yet been identified at the 
molecular level (see, e.g., Goodwin et al., 1997, Development 124:749-758), the disclosures 
of which are incorporated by reference in their entireties. Any gene containing a DRE 
including, but not limited to, the DREs described in the references cited above, can be used 
in the present invention to identify compounds that modulate untranslated region-dependent 
gene expression. 

5.1.1.14. Bruno Response Element

[0120] The Bruno Response Element ("BRE") is located in the 3' untranslated region
(UTR) of oskar mRNA (see, e.g., Castagnetti et al., 2000, Development 127(5):1063-8, the
disclosure of which is incorporated by reference in its entirety). The coupled regulation of
oskar mRNA localization and translation in time and space is critical for correct
anteroposterior patterning of the Drosophila embryo. Localization-dependent translation of
oskar mRNA, a mechanism whereby oskar RNA localized at the posterior of the oocyte is 
selectively translated and the unlocalized RNA remains in a translationally repressed state, 
ensures that Oskar activity is present exclusively at the posterior pole. Genetic experiments 
indicate that translational repression involves the binding of Bruno protein to multiple sites, 
the BREs, in the 3' untranslated region of oskar mRNA. Any gene containing a BRE
including, but not limited to, the BREs described in the references cited above, can be used
in the present invention to identify compounds that modulate untranslated region-dependent 
gene expression. 

5.1.1.15. 15-lipoxygenase Differentiation Control Element
[0121] The translation of 15-lipoxygenase ("LOX") mRNA in erythroid precursor cells
and of the L2 mRNA of human papilloma virus type 16 (HPV-16) in squamous epithelial
cells is silenced when either of these cells is immature and is activated in maturing cells by
unknown mechanisms. It has been shown that hnRNP K and the c-Src kinase specifically
interact with each other, leading to c-Src activation and tyrosine phosphorylation of hnRNP
K in vivo and in vitro. c-Src-mediated phosphorylation reversibly inhibits the binding of
hnRNP K to the differentiation control element ("DICE") of the LOX mRNA 3' 
untranslated region in vitro and specifically derepresses the translation of DICE-bearing 
mRNAs in vivo (see, e.g., Ostareck-Lederer et al., 2002, Mol Cell Biol 22(13):4535-43, the 
disclosure of which is incorporated by reference in its entirety). 
[0122] Cytidine-rich 15-lipoxygenase differentiation control element ("15-LOX-
DICE") is a multifunctional cis-acting element found in the 3' untranslated region of
numerous eukaryotic mRNAs. It binds KH domain proteins of the type hnRNP E and K,
thus mediating mRNA stabilization and translational control. Translational silencing is
caused by formation of a simple binary complex between DICE and recombinant hnRNP
E1. Electromobility shift assays and sucrose gradient centrifugation demonstrate that rabbit
15-LOX-DICE, which is composed of ten subunits of the sequence
CCCCPuCCCUCUUCCCCAAG (SEQ ID NO: 4), is able to bind up to ten molecules of
hnRNP E1 (see, e.g., Reimann et al., 2002, J Mol Biol 315(5):965-74 and Thiele et al.,
1999, Adv Exp Med Biol 447:45-61, the disclosures of which are incorporated by reference 
in their entireties). Any gene containing a 15-LOX-DICE including, but not limited to, the 
15-LOX-DICEs described in the references cited above, can be used in the present 
invention to identify compounds that modulate untranslated region-dependent gene 
expression. 
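
By way of illustration only, the repeat unit CCCCPuCCCUCUUCCCCAAG can be written as a pattern, with Pu read as a purine (A or G), and used to count DICE subunits in a 3' UTR sequence. The sketch below is a minimal example, not a description of how such elements were identified. The function name is hypothetical, and only exact matches to the stated unit are counted.

```python
import re

# One DICE subunit; "Pu" in the stated sequence is taken to mean A or G.
DICE_UNIT = re.compile(r"CCCC[AG]CCCUCUUCCCCAAG")


def count_dice_units(seq):
    """Count exact copies of the 19-nt DICE subunit in a sequence."""
    rna = seq.upper().replace("T", "U")
    return len(DICE_UNIT.findall(rna))
```

A sequence built from ten tandem copies of the unit would therefore give a count of ten, matching the ten subunits described for rabbit 15-LOX-DICE.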

5.2. Reporter Gene Constructs, Transfected Cells, and Cell-Free Extracts
[0123] The invention provides for specific vectors containing a reporter gene flanked by
one or more UTRs of a target gene and host cells transfected with the vectors. The
invention also provides for the in vitro translation of a reporter gene flanked by one or more
UTRs of a target gene. Techniques for practicing this specific aspect of this invention will
employ, unless otherwise indicated, conventional techniques of molecular biology,
microbiology, and recombinant DNA manipulation and production, which are routinely
practiced by one of skill in the art. See, e.g., Sambrook, 1989, Molecular Cloning, A 
Laboratory Manual, Second Edition; DNA Cloning, Volumes I and II (Glover, Ed. 1985); 
Oligonucleotide Synthesis (Gait, Ed. 1984); Nucleic Acid Hybridization (Hames & 
Higgins, Eds. 1984); Transcription and Translation (Hames & Higgins, Eds. 1984); Animal 
Cell Culture (Freshney, Ed. 1986); Immobilized Cells and Enzymes (IRL Press, 1986); 
Perbal, A Practical Guide to Molecular Cloning (1984); Gene Transfer Vectors for 
Mammalian Cells (Miller & Calos, Eds. 1987, Cold Spring Harbor Laboratory); Methods in
Enzymology, Volumes 154 and 155 (Wu & Grossman, and Wu, Eds., respectively), (Mayer 
& Walker, Eds., 1987); Immunochemical Methods in Cell and Molecular Biology 
(Academic Press, London, Scopes, 1987), Expression of Proteins in Mammalian Cells 
Using Vaccinia Viral Vectors in Current Protocols in Molecular Biology, Volume 2 
(Ausubel et al., Eds., 1991). 

5.2.1. Reporter Genes 
[0124] Any reporter gene well-known to one of skill in the art may be used in reporter
gene constructs to ascertain the effect of a compound on untranslated region-dependent 
expression of a target gene. Reporter genes refer to a nucleotide sequence encoding a 
protein that is readily detectable either by its presence or activity. Reporter genes may be 
obtained and the nucleotide sequence of the reporter gene determined by any method well- 
known to one of skill in the art. The nucleotide sequence of a reporter gene can be 
obtained, e.g., from the literature or a database such as GenBank. Alternatively, a 
polynucleotide encoding a reporter gene may be generated from nucleic acid from a suitable 
source. If a clone containing a nucleic acid encoding a particular reporter gene is not 
available, but the sequence of the reporter gene is known, a nucleic acid encoding the 
reporter gene may be chemically synthesized or obtained from a suitable source (e.g., a
cDNA library, or a cDNA library generated from, or nucleic acid, preferably poly A+ RNA,
isolated from, any tissue or cells expressing the reporter gene) by PCR amplification. Once
the nucleotide sequence of a reporter gene is determined, the nucleotide sequence of the
reporter gene may be manipulated using methods well-known in the art for the
manipulation of nucleotide sequences, e.g., recombinant DNA techniques, site directed
mutagenesis, PCR, etc. (see, for example, the techniques described in Sambrook et al., 1990,
Molecular Cloning, A Laboratory Manual, 2d Ed., Cold Spring Harbor Laboratory, Cold
Spring Harbor, NY and Ausubel et al., eds., 1998, Current Protocols in Molecular Biology,
John Wiley & Sons, NY, which are both incorporated by reference herein in their
entireties), to generate reporter genes having a different amino acid sequence, for example
to create amino acid substitutions, deletions, and/or insertions. 
[0125] Examples of reporter genes include, but are not limited to, luciferase (e.g., 
firefly luciferase, renilla luciferase, and click beetle luciferase), green fluorescent protein
("GFP") (e.g., green fluorescent protein, yellow fluorescent protein, red fluorescent protein,
cyan fluorescent protein, and blue fluorescent protein), beta-galactosidase ("b-gal"), beta-
glucuronidase, beta-lactamase, chloramphenicol acetyltransferase ("CAT"), and alkaline
phosphatase ("AP"). In a preferred embodiment, a reporter gene utilized in the reporter 
constructs is easily assayed and has an activity which is not normally found in the cell or 
organism of interest. 

5.2.1.1. Luciferase 

[0126] Luciferases are enzymes that emit light in the presence of oxygen and a substrate 
(luciferin) and which have been used for real-time, low-light imaging of gene expression in
cell cultures, individual cells, whole organisms, and transgenic organisms (reviewed by
Greer & Szalay, 2002, Luminescence 17(1):43-74).

[0127] As used herein, the term "luciferase" is intended to embrace all luciferases, or 
recombinant enzymes derived from luciferases which have luciferase activity. The
luciferase genes from fireflies have been well characterized, for example, from the Photinus
and Luciola species (see, e.g., International Patent Publication No. WO 95/25798 for
Photinus pyralis, European Patent Application No. EP 0 524 448 for Luciola cruciata and
Luciola lateralis, and Devine et al., 1993, Biochim. Biophys. Acta 1173(2):121-132 for
Luciola mingrelica). Other eucaryotic luciferase genes include, but are not limited to, the
click beetle (Pyrophorus plagiophthalamus, see, e.g., Wood et al., 1989, Science 244:700-
702), the sea pansy (Renilla reniformis, see, e.g., Lorenz et al., 1991, Proc Natl Acad Sci U
S A 88(10):4438-4442), and the glow worm (Lampyris noctiluca, see, e.g., Sala-Newby et
al., 1996, Biochem J. 313:761-767). The click beetle is unusual in that different members
of the species emit bioluminescence of different colors, with light emitted at 546 nm (green),
560 nm (yellow-green), 578 nm (yellow) and 593 nm (orange) (see, e.g., U.S. Patent Nos.
6,475,719; 6,342,379; and 6,217,847, the disclosures of which are incorporated by reference
in their entireties). Bacterial luciferin-luciferase systems include, but are not limited to, the
bacterial lux genes of terrestrial Photorhabdus luminescens (see, e.g., Manukhov et al.,
2000, Genetika 36(3):322-30) and marine bacteria Vibrio fischeri and Vibrio harveyi (see,
e.g., Miyamoto et al., 1988, J Biol Chem. 263(26):13393-9, and Cohn et al., 1983, Proc Natl
Acad Sci USA., 80(1):120-3, respectively). The luciferases encompassed by the present
invention also include the mutant luciferases described in U.S. Patent No. 6,265,177 to
Squirrell et al., which is hereby incorporated by reference in its entirety.
[0128] In a preferred embodiment, the luciferase is a firefly luciferase, a renilla 
luciferase, or a click beetle luciferase, as described in any one of the references listed supra, 
the disclosures of which are incorporated by reference in their entireties. 

5.2.1.2. Green Fluorescent Protein 
[0129] Green fluorescent protein ("GFP") is a 238 amino acid protein with amino acid 
residues 65 to 67 involved in the formation of the chromophore which does not require 
additional substrates or cofactors to fluoresce (see, e.g., Prasher et al., 1992, Gene 111:229- 
233; Yang et al., 1996, Nature Biotechnol. 14:1252-1256; and Cody et al., 1993, 
Biochemistry 32:1212-1218). 

[0130] As used herein, the term "green fluorescent protein" or "GFP" is intended to 
embrace all GFPs (including the various forms of GFPs which exhibit colors other than 
green), or recombinant enzymes derived from GFPs which have GFP activity. In a 
preferred embodiment, GFP includes green fluorescent protein, yellow fluorescent protein, 
red fluorescent protein, cyan fluorescent protein, and blue fluorescent protein. The native 
gene for GFP was cloned from the bioluminescent jellyfish Aequorea victoria (see, e.g., 
Morin et al., 1972, J. Cell Physiol. 77:313-318). Wild type GFP has a major excitation peak 
at 395 nm and a minor excitation peak at 470 nm. The absorption peak at 470 nm allows 
the monitoring of GFP levels using standard fluorescein isothiocyanate (FITC) filter sets. 
Mutants of the GFP gene have been found useful to enhance expression and to modify 
excitation and fluorescence. For example, mutant GFPs with alanine, glycine, isoleucine, 
or threonine substituted for serine at position 65 result in mutant GFPs with shifts in 
excitation maxima and greater fluorescence than wild type protein when excited at 488 nm 
(see, e.g., Heim et al., 1995, Nature 373:663-664; U.S. Patent No. 5,625,048; Delagrave et 
al., 1995, Biotechnology 13:151-154; Cormack et al., 1996, Gene 173:33-38; and Cramer et 
al., 1996, Nature Biotechnol. 14:315-319). The ability to excite GFP at 488 nm permits the 
use of GFP with standard fluorescence activated cell sorting ("FACS") equipment. In 
another embodiment, GFPs are isolated from organisms other than the jellyfish, such as, but 
not limited to, the sea pansy, Renilla reniformis. 

[0131] Techniques for labeling cells with GFP in general are described in U.S. Patent 
Nos. 5,491,084 and 5,804,387, which are incorporated by reference in their entireties; 
Chalfie et al., 1994, Science 263:802-805; Heim et al., 1994, Proc. Natl. Acad. Sci. USA 
91:12501-12504; Morise et al., 1974, Biochemistry 13:2656-2662; Ward et al., 1980, 
Photochem. Photobiol. 31:611-615; Rizzuto et al., 1995, Curr. Biology 5:635-642; and 
Kaether & Gerdes, 1995, FEBS Lett 369:267-271. The expression of GFPs in E. coli and 
C. elegans is described in U.S. Patent No. 6,251,384 to Tan et al., which is incorporated 
by reference in its entirety. The expression of GFP in plant cells is discussed in Hu & 
Cheng, 1995, FEBS Lett 369:331-33, and GFP expression in Drosophila is described in 
Davis et al., 1995, Dev. Biology 170:726-729. 

5.2.1.3. Beta-galactosidase 

[0132] Beta galactosidase ("b-gal") is an enzyme that catalyzes the hydrolysis of 
b-galactosides, including lactose, and the galactoside analogs 
o-nitrophenyl-b-D-galactopyranoside ("ONPG") and chlorophenol red-b-D-galactopyranoside ("CPRG") (see, 
e.g., Nielsen et al., 1983, Proc Natl Acad Sci USA 80(17):5198-5202; Eustice et al., 1991, 
Biotechniques 11:739-742; and Henderson et al., 1986, Clin. Chem. 32:1637-1641). The 
b-gal gene functions well as a reporter gene because the protein product is extremely stable, 
resistant to proteolytic degradation in cellular lysates, and easily assayed. When ONPG is 
used as the substrate, b-gal activity can be quantitated with a spectrophotometer or 
microplate reader. 

[0133] As used herein, the term "beta galactosidase" or "b-gal" is intended to embrace 
all b-gals, including lacZ gene products, or recombinant enzymes derived from b-gals 
which have b-gal activity. The b-gal gene functions well as a reporter gene because the 
protein product is extremely stable, resistant to proteolytic degradation in cellular lysates, 
and easily assayed. In an embodiment where ONPG is the substrate, b-gal activity can be 
quantitated with a spectrophotometer or microplate reader to determine the amount of 
ONPG converted at 420 nm. In an embodiment where CPRG is the substrate, b-gal activity 
can be quantitated with a spectrophotometer or microplate reader to determine the amount 
of CPRG converted at 570 to 595 nm. In yet another embodiment, the b-gal activity can be 
visually ascertained by plating bacterial cells transformed with a b-gal construct onto plates 
containing X-gal and IPTG. Bacterial colonies that are dark blue indicate the presence of 
high b-gal activity and colonies that are varying shades of blue indicate varying levels of 
b-gal activity. 

5.2.1.4. Beta-glucuronidase 
[0134] Beta-glucuronidase ("GUS") catalyzes the hydrolysis of a very wide variety of 
b-glucuronides, and, with much lower efficiency, hydrolyzes some b-galacturonides. GUS 
is very stable, will tolerate many detergents and widely varying ionic conditions, has no 
cofactors, nor any ionic requirements, can be assayed at any physiological pH, with an 
optimum between 5.0 and 7.8, and is reasonably resistant to thermal inactivation (see, e.g., 
U.S. Patent No. 5,268,463, which is incorporated by reference in its entirety). 
[0135] In one embodiment, the GUS is derived from the Escherichia coli 
b-glucuronidase gene. In alternate embodiments of the invention, the b-glucuronidase 
encoding nucleic acid is homologous to the E. coli b-glucuronidase gene and/or may be 
derived from another organism or species. 

[0136] GUS activity can be assayed either by fluorescence or spectrometry, or any other 
method described in U.S. Patent No. 5,268,463, the disclosure of which is incorporated by 
reference in its entirety. For a fluorescent assay, 4-trifluoromethylumbelliferyl 
b-D-glucuronide is a very sensitive substrate for GUS. The fluorescence maximum is close to 
500 nm (bluish green), where very few plant compounds fluoresce or absorb. 
4-trifluoromethylumbelliferyl b-D-glucuronide also fluoresces much more strongly near 
neutral pH, allowing continuous assays to be performed more readily than with MUG 
(4-methylumbelliferyl b-D-glucuronide). 4-trifluoromethylumbelliferyl b-D-glucuronide can 
be used as a fluorescent indicator in vivo. 
The spectrophotometric assay is very straightforward and moderately sensitive (Jefferson et 
al., 1986, Proc. Natl. Acad. Sci. USA 83:8447-8451). A preferred substrate for 
spectrophotometric measurement is p-nitrophenyl b-D-glucuronide, which when cleaved by 
GUS releases the chromophore p-nitrophenol. At a pH greater than its pKa (around 7.15) 
the ionized chromophore absorbs light at 400-420 nm, giving a yellow color. 
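
The pH dependence of the p-nitrophenol signal can be illustrated with a short calculation. The following Python sketch applies the Henderson-Hasselbalch relationship to estimate the fraction of p-nitrophenol present in its ionized (yellow) form, using the approximate pKa of 7.15 given above. The function name and the example pH values are illustrative assumptions and do not form part of the assay described herein.

```python
def ionized_fraction(ph: float, pka: float = 7.15) -> float:
    """Estimate the fraction of p-nitrophenol in the ionized (yellow) form.

    Uses the Henderson-Hasselbalch relationship:
        fraction = 1 / (1 + 10 ** (pKa - pH))
    The default pKa is the approximate value quoted in the text.
    """
    return 1.0 / (1.0 + 10.0 ** (pka - ph))


# Illustrative pH values only (assumed, not taken from the text).
for ph in (6.15, 7.15, 8.15, 10.0):
    print(f"pH {ph:5.2f}: {ionized_fraction(ph):.3f} of the chromophore ionized")
```

Under this relationship, half of the chromophore is ionized at a pH equal to the pKa, which is consistent with reading the absorbance at a pH above the pKa.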

5.2.1.5. Beta-lactamase 
[0137] Beta-lactamases are nearly optimal enzymes with respect to their almost 
diffusion-controlled catalysis of b-lactam hydrolysis, making them suited to the task of an 
intracellular reporter enzyme (see, e.g., Christensen et al., 1990, Biochem. J. 266: 853-861). 
They cleave the b-lactam ring of b-lactam antibiotics, such as penicillins and 
cephalosporins, generating new charged moieties in the process (see, e.g., O'Callaghan et 
al., 1968, Antimicrob. Agents. Chemother. 8: 57-63 and Stratton, 1988, J. Antimicrob. 
Chemother. 22, Suppl. A: 23-35). A large number of b-lactamases have been isolated and 
characterized, all of which would be suitable for use in accordance with the present 
invention (see, e.g., Richmond & Sykes, 1978, Adv. Microb. Physiol. 9:31-88 and Ambler, 
1980, Phil. Trans. R. Soc. Lond. [Ser.B.] 289: 321-331, the disclosures of which are 
incorporated by reference in their entireties). 

[0138] The coding region of an exemplary b-lactamase employed has been described in 
U.S. Patent No. 6,472,205, Kadonaga et al., 1984, J. Biol. Chem. 259:2149-2154, and 
Sutcliffe, 1978, Proc. Natl. Acad. Sci. USA 75:3737-3741, the disclosures of which are 
incorporated by reference in their entireties. As would be readily apparent to those skilled 
in the field, this and other comparable sequences for peptides having b-lactamase activity 
would be equally suitable for use in accordance with the present invention. The 
combination of a fluorogenic substrate described in U.S. Patent Nos. 6,472,205, 5,955,604, 
and 5,741,657, the disclosures of which are incorporated by reference in their entireties, and 
a suitable b-lactamase can be employed in a wide variety of different assay systems, such as 
are described in U.S. Patent No. 4,740,459, which is hereby incorporated by reference in its 
entirety. 

5.2.1.6. Chloramphenicol Acetyltransferase 

[0139] Chloramphenicol acetyltransferase ("CAT") is commonly used as a reporter 
gene in mammalian cell systems because mammalian cells do not have detectable levels of 
CAT activity. The assay for CAT involves incubating cellular extracts with radiolabeled 
chloramphenicol and appropriate co-factors, separating the starting materials from the 
product by, for example, thin layer chromatography ("TLC"), followed by scintillation 
counting (see, e.g., U.S. Patent No. 5,726,041, which is hereby incorporated by reference in 
its entirety). 

[0140] As used herein, the term "chloramphenicol acetyltransferase" or "CAT" is 
intended to embrace all CATs, or recombinant enzymes derived from CAT which have 
CAT activity. While a reporter system which does not require cell 
processing, radioisotopes, and chromatographic separations would be more amenable to 
high-throughput screening, CAT as a reporter gene may be preferable in situations when 
stability of the reporter gene is important. For example, the CAT reporter protein has an in 
vivo half life of about 50 hours, which is advantageous when an accumulative versus a 
dynamic change type of result is desired. 
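
The practical effect of a long reporter half life can be sketched numerically. The Python example below assumes simple first-order (exponential) decay, which is an assumption made only for illustration and is not stated in the text, and uses the half life of about 50 hours given above. The 3-hour half life used for comparison is a hypothetical value for a short-lived reporter.

```python
def fraction_remaining(hours: float, half_life_h: float = 50.0) -> float:
    """Fraction of reporter protein remaining after `hours`.

    Assumes first-order decay (an illustrative assumption); the default
    half life is the approximate CAT value quoted in the text.
    """
    if half_life_h <= 0:
        raise ValueError("half life must be positive")
    return 0.5 ** (hours / half_life_h)


# Compare CAT with a hypothetical short-lived reporter (3 h half life, assumed).
for t in (6, 24, 48):
    cat = fraction_remaining(t)
    short_lived = fraction_remaining(t, half_life_h=3.0)
    print(f"{t:2d} h: CAT {cat:.2f} remaining, short-lived reporter {short_lived:.4f} remaining")
```

A slowly decaying reporter retains most of its signal over a typical incubation, so the readout reflects accumulated expression rather than rapid changes.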

5.2.1.7. Secreted Alkaline Phosphatase 

[0141] The secreted alkaline phosphatase ("SEAP") enzyme is a truncated form of 
alkaline phosphatase, in which the cleavage of the transmembrane domain of the protein 
allows it to be secreted from the cells into the surrounding media. In a preferred 
embodiment, the alkaline phosphatase is isolated from human placenta. 
[0142] As used herein, the term "secreted alkaline phosphatase" or "SEAP" is intended 
to embrace all SEAP or recombinant enzymes derived from SEAP which have alkaline 
phosphatase activity. SEAP activity can be detected by a variety of methods including, but 
not limited to, measurement of catalysis of a fluorescent substrate, immunoprecipitation, 
HPLC, and radiometric detection. The luminescent method is preferred due to its increased 
sensitivity over colorimetric detection methods. An advantage of using SEAP is that a 
cell lysis step is not required since the SEAP protein is secreted out of the cell, which 
facilitates the automation of sampling and assay procedures. A cell-based assay using 
SEAP for use in cell-based assessment of inhibitors of the Hepatitis C virus protease is 
described in U.S. Patent No. 6,280,940 to Potts et al., which is hereby incorporated by 
reference in its entirety. 

5.2.2. Reporter Gene Constructs 
[0143] The invention provides reporter gene constructs for use in the reporter gene- 
based assays described herein for the identification of compounds that modulate 
untranslated region-dependent expression of a target gene. The reporter gene constructs of 
the invention comprise one or more reporter genes fused to one or more untranslated 
regions. For example, specific RNA sequences, RNA structural motifs, and/or RNA 
structural elements that are known or suspected to modulate untranslated region-dependent 
expression of a target gene may be fused to the reporter gene. 
[0144] The present invention provides for a reporter gene flanked by one or more 
untranslated regions (e.g., the 5' UTR, 3' UTR, or both the 5' UTR and 3' UTR of the 
target gene). The present invention also provides for a reporter gene flanked by one or 
more UTRs of a target gene, said UTRs containing one or more mutations (e.g., one or 
more substitutions, deletions and/or additions). In a preferred embodiment, the reporter 
gene is flanked by both 5' and 3' UTRs so that compounds that interfere with an interaction 
between the 5' and 3' UTRs can be identified. In another preferred embodiment, a stable 
hairpin secondary structure is inserted into the UTR, preferably the 5' UTR of the target 
gene. For example, in cases where the 5' UTR possesses IRES activity, the addition of a 
stable hairpin secondary structure in the 5' UTR can be used to separate cap-dependent 
from cap-independent translation (see, e.g., Muhlrad et al., 1995, Mol. Cell. Biol. 
15(4):2145-56, the disclosure of which is incorporated by reference in its entirety). In 
another embodiment, an intron is inserted into a UTR (preferably, the 5' UTR) or at the 5' 
end of an ORF of a target gene. For example, but not by limitation, in cases where an RNA 
possesses instability elements, an intron, e.g., the human elongation factor one alpha (EF-1 
alpha) first intron, can be cloned into a UTR (preferably, the 5' UTR) or a 5' end of the 
ORF to increase expression (see, e.g., Kim et al., 2002, J Biotechnol 93(2):183-7, the 
disclosure of which is incorporated by reference in its entirety). In a preferred embodiment, 
both a stable hairpin secondary structure and an intron are added to the reporter gene 
construct. In a more preferred embodiment, the stable hairpin secondary structure is cloned 
into the 5' UTR and the intron is added at the 5' end of the ORF of the reporter gene. 
[0145] The reporter gene can be positioned such that the translation of that reporter 
gene is dependent upon the mode of translation initiation, such as, but not limited to, cap- 
dependent translation or cap-independent translation (i.e., translation via an internal 
ribosome entry site). Alternatively, where the UTR contains an upstream open reading 
frame, the reporter gene can be positioned such that the reporter protein is translated only in 
the presence of a compound that shifts the reading frame of the UTR so that the formerly 
untranslated open reading frame is then translated. 

[0146] The reporter gene constructs can be monocistronic or multicistronic. A 
multicistronic reporter gene construct may encode 2, 3, 4, 5, 6, 7, 8, 9, 10 or more, or in the 
range of 2-5, 5-10 or 10-20 reporter genes. For example, a dicistronic reporter gene 
construct may comprise, in the following order, a promoter, a first reporter gene, a 5' UTR of a 
target gene, a second reporter gene and, optionally, a 3' UTR of a target gene. In such a 
reporter construct, the transcription of both reporter genes is driven by the promoter, 
whereas the translation of the mRNA from the first reporter gene is by a cap-dependent 
scanning mechanism and the translation of the mRNA from the second reporter gene is by a 
cap-independent mechanism by an IRES. The IRES-dependent translation of the mRNA of 
the second reporter gene can be normalized against cap-dependent translation. 
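
The normalization described for the dicistronic construct can be sketched as a simple ratio. In the Python example below, the signal from the second (IRES-dependent) reporter is divided by the signal from the first (cap-dependent) reporter for each well, and compound-treated wells are then compared with control wells. The function names, the use of a mean across wells, and all readings are hypothetical and serve only to illustrate one way such a normalization might be computed.

```python
from statistics import mean


def ires_to_cap_ratio(first_reporter: float, second_reporter: float) -> float:
    """Second-reporter (IRES-dependent) signal per unit of first-reporter
    (cap-dependent) signal for a single well."""
    if first_reporter <= 0:
        raise ValueError("first-reporter signal must be positive")
    return second_reporter / first_reporter


def normalized_fold_change(treated, control) -> float:
    """Mean normalized ratio of treated wells relative to control wells.

    Each argument is a list of (first_reporter, second_reporter) readings.
    """
    treated_ratio = mean(ires_to_cap_ratio(f, s) for f, s in treated)
    control_ratio = mean(ires_to_cap_ratio(f, s) for f, s in control)
    return treated_ratio / control_ratio


# Hypothetical readings in arbitrary light units (not data from this document).
control_wells = [(1000.0, 200.0), (1200.0, 240.0)]
treated_wells = [(1000.0, 100.0), (800.0, 80.0)]

print(f"normalized fold change: {normalized_fold_change(treated_wells, control_wells):.2f}")
```

Dividing by the cap-dependent signal helps distinguish a compound that selectively affects IRES-dependent translation from one that changes transcription or overall translation of the dicistronic mRNA.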

5.2.3. Expression of Reporter Gene Constructs in Cells 
5.2.3.1. Vectors 

[0147] The nucleotide sequence coding for a reporter gene can be inserted into an 
appropriate expression vector, i.e., a vector which contains the necessary elements for the 
transcription and translation of the inserted protein-coding sequence. The necessary 
transcriptional and translational signals can also be supplied by the target gene or the 
reporter gene. A variety of host-vector systems may be utilized to express the reporter 
gene. These include, but are not limited to, mammalian cell systems infected with virus 
(e.g., vaccinia virus, adenovirus, etc.); insect cell systems infected with virus (e.g., 
baculovirus); microorganisms such as yeast containing yeast vectors, or bacteria 
transformed with bacteriophage DNA, plasmid DNA, or cosmid DNA; and stable cell lines 
generated by transformation using a selectable marker. The expression elements of vectors 
vary in their strengths and specificities. Depending on the host-vector system utilized, any 
one of a number of suitable transcription and translation elements may be used. In specific 
embodiments, the reporter gene is expressed, or a fusion protein comprising the reporter 
gene and the ORF, or a fragment thereof, of the target gene is expressed. 
[0148] Any of the methods previously described for the insertion of DNA fragments 
into a vector may be used to construct expression vectors containing a chimeric nucleic acid 
consisting of appropriate transcriptional/translational control signals and the protein coding 
sequences. These methods may include in vitro recombinant DNA and synthetic techniques 
and in vivo recombinants (genetic recombination). Expression of the reporter gene 
construct may be regulated by a second nucleic acid sequence so that the reporter gene is 
expressed in a host transformed with the recombinant DNA molecule. For example, 
expression of a reporter gene construct may be controlled by any promoter/enhancer 
element known in the art, such as a constitutive promoter, a tissue-specific promoter, or an 
inducible promoter. Specific examples of promoters which may be used to control gene 
expression include, but are not limited to, the SV40 early promoter region (Benoist & 
Chambon, 1981, Nature 290:304-310), the promoter contained in the 3' long terminal 
repeat of Rous sarcoma virus (Yamamoto et al., 1980, Cell 22:787-797), the herpes 
thymidine kinase promoter (Wagner et al., 1981, Proc. Natl. Acad. Sci. U.S.A. 78:1441-1445), 
the regulatory sequences of the metallothionein gene (Brinster et al., 1982, Nature 
296:39-42); prokaryotic expression vectors such as the beta-lactamase promoter 
(Villa-Komaroff et al., 1978, Proc. Natl. Acad. Sci. U.S.A. 75:3727-3731), or the tac promoter 
(DeBoer et al., 1983, Proc. Natl. Acad. Sci. U.S.A. 80:21-25); see also "Useful proteins 
from recombinant bacteria" in Scientific American, 1980, 242:74-94; plant expression 
vectors comprising the nopaline synthetase promoter region (Herrera-Estrella et al., Nature 
303:209-213) or the cauliflower mosaic virus 35S RNA promoter (Gardner et al., 1981, 
Nucl. Acids Res. 9:2871), and the promoter of the photosynthetic enzyme ribulose 
biphosphate carboxylase (Herrera-Estrella et al., 1984, Nature 310:115-120); promoter 
elements from yeast or other fungi such as the Gal 4 promoter, the ADC (alcohol 
dehydrogenase) promoter, PGK (phosphoglycerol kinase) promoter, alkaline phosphatase 
promoter, and the following animal transcriptional control regions, which exhibit tissue 
specificity and have been utilized in transgenic animals: elastase I gene control region 
which is active in pancreatic acinar cells (Swift et al., 1984, Cell 38:639-646; Ornitz et al., 
1986, Cold Spring Harbor Symp. Quant. Biol. 50:399-409; MacDonald, 1987, Hepatology 
7:425-515); insulin gene control region which is active in pancreatic beta cells (Hanahan, 
1985, Nature 315:115-122), immunoglobulin gene control region which is active in 
lymphoid cells (Grosschedl et al., 1984, Cell 38:647-658; Adames et al., 1985, Nature 
318:533-538; Alexander et al., 1987, Mol. Cell. Biol. 7:1436-1444), mouse mammary 
tumor virus control region which is active in testicular, breast, lymphoid and mast cells 
(Leder et al., 1986, Cell 45:485-495), albumin gene control region which is active in liver 
(Pinkert et al., 1987, Genes and Devel. 1:268-276), alpha-fetoprotein gene control region 
which is active in liver (Krumlauf et al., 1985, Mol. Cell. Biol. 5:1639-1648; Hammer et al., 
1987, Science 235:53-58), alpha 1-antitrypsin gene control region which is active in the liver 
(Kelsey et al., 1987, Genes and Devel. 1:161-171), beta-globin gene control region which is 
active in myeloid cells (Mogram et al., 1985, Nature 315:338-340; Kollias et al., 1986, Cell 
46:89-94), myelin basic protein gene control region which is active in oligodendrocyte cells 
in the brain (Readhead et al., 1987, Cell 48:703-712); myosin light chain-2 gene control 
region which is active in skeletal muscle (Sani, 1985, Nature 314:283-286), and 
gonadotropic releasing hormone gene control region which is active in the hypothalamus 
(Mason et al., 1986, Science 234:1372-1378). 

[0149] In a specific embodiment, a vector is used that comprises a promoter operably 
linked to a reporter gene flanked by one or more UTRs of a target gene, origins of 
replication from one or more species, and, optionally, one or more selectable markers (e.g., 
an antibiotic resistance gene). In a preferred embodiment, the vectors are CMV vectors, T7 
vectors, lac vectors, pCEP4 vectors, or 5.0/FRT vectors. 

[0150] In a specific embodiment, an expression construct is made by amplifying the 5' 
and/or 3' UTRs of a target gene and ligating the UTRs to a reporter gene such as luciferase, 
and subcloning them into a pT-Adv vector (Clontech Laboratories, Palo Alto, California). 
It is understood by one of skill in the art that the construction of the reporter plasmid may 
require the construction of intermediate plasmids if several ligations are involved. 
[0151] Expression vectors containing the reporter gene construct of the present 
invention can be identified by four general approaches: (a) nucleic acid sequencing, (b) 
nucleic acid hybridization, (c) presence or absence of "marker" nucleic acid functions, and 
(d) expression of inserted sequences. In the first approach, the presence of the UTRs and/or 
the reporter gene inserted in an expression vector can be detected by sequencing. In the 
second approach, the presence of the UTRs and/or the reporter gene inserted in an 
expression vector can be detected by nucleic acid hybridization using probes comprising 
sequences that are homologous to the inserted UTRs and/or reporter gene. In the third 
approach, the recombinant vector/host system can be identified and selected based upon the 
presence or absence of certain "marker" nucleic acid functions (e.g., thymidine kinase 
activity, resistance to antibiotics, transformation phenotype, occlusion body formation in 
baculovirus, etc.) caused by the insertion of the nucleic acid of interest, i.e., the reporter 
gene construct, in the vector. For example, if the nucleic acid of interest is inserted within 
the marker nucleic acid sequence of the vector, recombinants containing the insert can be 
identified by the absence of the marker nucleic acid function. In the fourth approach, 
recombinant expression vectors can be identified by assaying the reporter gene product 
expressed by the recombinant. Such assays can be based, for example, on the physical or 
functional properties of the particular reporter gene. 

[0152] In a preferred embodiment, the reporter gene constructs are cloned into stable 
cell line expression vectors. In a preferred embodiment, the stable cell line expression 
vector contains a site specific genomic integration site, such as but not limited to, pCMPl 
(see, e.g., FIG. 8C in Example 10). In another preferred embodiment, the reporter gene 
construct is cloned into an episomal mammalian expression vector, such as, but not limited 
to, pCMR2 (see, e.g., FIG. 8B in Example 10). 

5.2.3.2. Transfection 
[0153] Once a vector encoding the appropriate gene has been synthesized, a host cell is 
transformed or transfected with the vector of interest. The use of stable transformants is 
preferred. In a preferred embodiment, the host cell is a mammalian cell. In a more 
preferred embodiment, the host cell is a human cell. In another embodiment, the host cells 
are primary cells isolated from a tissue or other biological sample of interest. Host cells 
that can be used in the methods of the present invention include, but are not limited to, 
hybridomas, pre-B cells, 293 cells, 293T cells, HeLa cells, HepG2 cells, K562 cells, 3T3 
cells, MCF7 cells, SkBr3 cells, or BT474 cells. In another preferred embodiment, the host 
cells are derived from tissue specific to the target gene. In yet another preferred 
embodiment, the host cells are immortalized cell lines derived from a source, e.g., a tissue, 
specific to the target gene. Other host cells that can be used in the present invention 
include, but are not limited to, bacterial cells, yeast cells, virally-infected cells, or plant 
cells. 

[0154] Transformation may be by any known method for introducing polynucleotides 
into a host cell, for example by packaging the polynucleotide in a virus and transducing a 
host cell with the virus, and by direct uptake of the polynucleotide. The transformation 
procedure used depends upon the host to be transformed. Bacterial transformation by direct 
uptake generally employs treatment with calcium or rubidium chloride (see, e.g., Cohen, 
1972, Proc. Natl. Acad. Sci. USA 69:2110 and Maniatis et al., 1982, "Molecular Cloning: A 
Laboratory Manual" (Cold Spring Harbor Press, Cold Spring Harbor, N.Y.)). Yeast 
transformation by direct uptake may be carried out using the method of Schiestl & Gietz, 
1989, Current Genetics 16:339-346 or Hinnen et al., 1978, Proc. Natl. Acad. Sci. USA 
75:1929. Mammalian transformations (i.e., transfections) by direct uptake may be 
conducted using the calcium phosphate precipitation method of Graham & Van der Eb, 
1978, Virol. 52:546, or the various known modifications thereof. Other methods for 
introducing recombinant polynucleotides into cells, particularly into mammalian cells, 
include dextran-mediated transfection, calcium phosphate mediated transfection, polybrene 
mediated transfection, protoplast fusion, electroporation, encapsulation of the 
polynucleotide(s) in liposomes, and direct microinjection of the polynucleotides into nuclei. 
Such methods are well-known to one of skill in the art. 

[0155] In a preferred embodiment, stable cell lines containing the constructs of interest 
are generated for high throughput screening. Such stable cell lines may be generated by 
introducing a reporter gene construct comprising a selectable marker, allowing the cells to 
grow for 1-2 days in an enriched medium, and then growing the cells on a selective 
medium. The selectable marker in the recombinant plasmid confers resistance to the 
selection and allows cells to stably integrate the plasmid into their chromosomes and grow 
to form foci which in turn can be cloned and expanded into cell lines. 
[0156] A number of selection systems may be used, including but not limited to the 
herpes simplex virus thymidine kinase (see, e.g., Wigler et al., 1977, Cell 11:223), 
hypoxanthine-guanine phosphoribosyltransferase (see, e.g., Szybalska & Szybalski, 1962, 
Proc. Natl. Acad. Sci. USA 48:2026), and adenine phosphoribosyltransferase (see, e.g., 
Lowy et al., 1980, Cell 22:817) genes, which can be employed in tk-, hgprt- or aprt- cells, 
respectively. Also, anti-metabolite resistance can be used as the basis of selection for dhfr, 
which confers resistance to methotrexate (see, e.g., Wigler et al., 1980, Proc. Natl. Acad. Sci. 
USA 77:3567 and O'Hare et al., 1981, Proc. Natl. Acad. Sci. USA 78:1527); gpt, which 
confers resistance to mycophenolic acid (see, e.g., Mulligan & Berg, 1981, Proc. Natl. 
Acad. Sci. USA 78:2072); neo, which confers resistance to the aminoglycoside G-418 
(see, e.g., Colberre-Garapin et al., 1981, J. Mol. Biol. 150:1); and hygro, which confers 
resistance to hygromycin (see, e.g., Santerre et al., 1984, Gene 30:147). 

5.2.4. Cell-Free Extracts 
[0157] The invention provides for the translation of the reporter gene constructs in a 
cell-free system. Techniques for practicing this specific aspect of this invention will 
employ, unless otherwise indicated, conventional techniques of molecular biology, 
microbiology, and recombinant DNA manipulation and production, which are routinely 
practiced by one of skill in the art. See, e.g., Sambrook, 1989, Molecular Cloning, A 
Laboratory Manual, Second Edition; DNA Cloning, Volumes I and II (Glover, Ed. 1985); 
and Transcription and Translation (Hames & Higgins, Eds. 1984). 

[0158] Any technique well-known to one of skill in the art may be used to generate 
cell-free extracts for translation in vitro (otherwise referred to herein as cell-free translation 
mixtures). For example, the cell-free extracts for in vitro translation reactions can be 
generated by centrifuging cells and clarifying the supernatant. The cell extract for the 
present invention is about an S1 (i.e., the supernatant from a 1,000 x g spin) to about an S500 
extract (i.e., the supernatant from a 500,000 x g spin), preferably about an S10 (i.e., the 
supernatant from a 10,000 x g spin) to S250 (i.e., the supernatant from a 250,000 x g spin) 
extract. In some embodiments, about an S50 (i.e., the supernatant from a 50,000 x g spin) to 
S100 (i.e., the supernatant from a 100,000 x g spin) extract is preferred. 
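
The extract nomenclature used above, in which an "S" followed by a number denotes the supernatant from a spin at that number of thousands x g, can be expressed as a small lookup. The Python sketch below converts a label such as "S10" into the corresponding centrifugal force and reports which of the three ranges recited above contain it. The function names and range labels are illustrative assumptions only.

```python
import re

# Ranges recited in the text, as (low, high) S-numbers, inclusive.
# The dictionary keys are illustrative labels, not terms from the document.
RANGES = {
    "broad": (1, 500),
    "preferred": (10, 250),
    "some embodiments": (50, 100),
}


def s_fraction_to_g(label: str) -> int:
    """Convert an extract label such as 'S10' to a centrifugal force in x g."""
    match = re.fullmatch(r"S(\d+)", label.strip())
    if match is None:
        raise ValueError(f"not an S-fraction label: {label!r}")
    return int(match.group(1)) * 1000


def ranges_containing(label: str) -> list:
    """Names of the recited ranges that include the given S-fraction."""
    s_number = s_fraction_to_g(label) // 1000
    return [name for name, (low, high) in RANGES.items() if low <= s_number <= high]


for label in ("S1", "S10", "S100", "S500"):
    print(label, s_fraction_to_g(label), "x g:", ranges_containing(label))
```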
[0159] The cell-free translation extract may be isolated from cells of any species origin. 
For example, the cell-free translation extract may be isolated from human cells (e.g., HeLa 
cells), 293 cells, Vero cells, yeast, mouse cells (e.g., cultured mouse cells), rat cells (e.g., 
cultured rat cells), Chinese hamster ovary (CHO) cells, Xenopus oocytes, rabbit 
reticulocytes, primary cells, cancer cells (e.g., undifferentiated cancer cells), cell lines, 
wheat germ, rye embryo, or bacterial cell extract (see, e.g., Krieg & Melton, 1984, Nature 
308:203 and Dignam et al., 1990, Methods Enzymol. 182:194-203). Alternatively, the 
cell-free translation extract, e.g., rabbit reticulocyte lysates and wheat germ extract, can be 
purchased from, e.g., Promega (Madison, WI). It is preferred that the cells from which the 
cell-free extract is obtained do not endogenously express a target gene of interest. In a 
preferred embodiment, the cell-free extract is an extract isolated from human cells. In a 
more preferred embodiment, the human cells are HeLa cells. 

5.3. Libraries of Compounds 
[0160] Libraries screened using the methods of the present invention can comprise a
variety of types of compounds. Examples of libraries that can be screened in accordance
with the methods of the invention include, but are not limited to, peptoids; random
biooligomers; diversomers such as hydantoins, benzodiazepines and dipeptides; vinylogous
polypeptides; nonpeptidal peptidomimetics; oligocarbamates; peptidyl phosphonates;
peptide nucleic acid libraries; antibody libraries; carbohydrate libraries; and small molecule
libraries (preferably small organic molecules). In some embodiments, the compounds in the
libraries screened are nucleic acid or peptide molecules. In a non-limiting example, peptide
molecules can exist in a phage display library. In other embodiments, the types of
compounds include, but are not limited to, peptide analogs including peptides comprising
non-naturally occurring amino acids, e.g., D-amino acids, phosphorous analogs of amino
acids, such as α-amino phosphoric acids and α-amino phosphoric acids, or amino acids
having non-peptide linkages, nucleic acid analogs such as phosphorothioates and PNAs,
hormones, antigens, synthetic or naturally occurring drugs, opiates, dopamine, serotonin,
catecholamines, thrombin, acetylcholine, prostaglandins, organic molecules, pheromones,
adenosine, sucrose, glucose, lactose and galactose. Libraries of polypeptides or proteins
can also be used in the assays of the invention.

[0161] In a preferred embodiment, the combinatorial libraries are small organic
molecule libraries including, but not limited to, benzodiazepines, isoprenoids, beta-
carbolines, thiazolidinones, metathiazanones, pyrrolidines, morpholino compounds, and
benzodiazepines. In another embodiment, the combinatorial libraries comprise peptoids;
random bio-oligomers; benzodiazepines; diversomers such as hydantoins, benzodiazepines
and dipeptides; vinylogous polypeptides; nonpeptidal peptidomimetics; oligocarbamates;
peptidyl phosphonates; peptide nucleic acid libraries; antibody libraries; or carbohydrate
libraries. Combinatorial libraries are themselves commercially available (see, e.g.,
ComGenex, Princeton, New Jersey; Asinex, Moscow, Russia; Tripos, Inc., St. Louis, Missouri;
ChemStar, Ltd, Moscow, Russia; 3D Pharmaceuticals, Exton, Pennsylvania; Martek
Biosciences, Columbia, Maryland; etc.).

[0162] In a preferred embodiment, the library is preselected so that the compounds of
the library are more amenable to cellular uptake. For example, compounds are selected
based on specific parameters such as, but not limited to, size, lipophilicity, hydrophilicity,
and hydrogen bonding, which enhance the likelihood of compounds getting into the cells.
In another embodiment, the compounds are analyzed by three-dimensional or
four-dimensional computer computation programs.
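
As a rough illustration of the preselection described above, the following sketch filters a compound list on size, lipophilicity, and hydrogen-bonding descriptors. The cutoffs follow the commonly cited "rule of five" heuristics and are an assumption; the text names the parameters but states no thresholds. The `Compound` record and the descriptor values are hypothetical.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Compound:
    name: str
    mol_weight: float       # size, in daltons
    log_p: float            # lipophilicity (octanol/water partition coefficient)
    h_bond_donors: int
    h_bond_acceptors: int


# Assumed cutoffs (rule-of-five style); not taken from the source text.
MAX_MOL_WEIGHT = 500.0
MAX_LOG_P = 5.0
MAX_DONORS = 5
MAX_ACCEPTORS = 10


def likely_cell_permeable(c: Compound) -> bool:
    """Return True if every descriptor falls within the assumed cutoffs."""
    return (
        c.mol_weight <= MAX_MOL_WEIGHT
        and c.log_p <= MAX_LOG_P
        and c.h_bond_donors <= MAX_DONORS
        and c.h_bond_acceptors <= MAX_ACCEPTORS
    )


def preselect(library):
    """Keep only the compounds that pass the permeability heuristic."""
    return [c for c in library if likely_cell_permeable(c)]


# Hypothetical two-compound library.
library = [
    Compound("cmpd-A", 320.4, 2.1, 2, 5),
    Compound("cmpd-B", 812.9, 6.3, 7, 14),
]
selected = preselect(library)
print([c.name for c in selected])
```

In practice the cutoffs would be tuned to the cell type and assay, so they are kept as named constants rather than buried in the filter.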

[0163] The combinatorial compound library for use in accordance with the methods of
the present invention may be synthesized. There is a great interest in synthetic methods
directed toward the creation of large collections of small organic compounds, or libraries,
which could be screened for pharmacological, biological or other activity. The synthetic
methods applied to create vast combinatorial libraries are performed in solution or in the
solid phase, i.e., on a solid support. Solid-phase synthesis makes it easier to conduct
multi-step reactions and to drive reactions to completion with high yields because excess
reagents can be easily added and washed away after each reaction step. Solid-phase
combinatorial synthesis also tends to improve isolation, purification and screening.
However, the more traditional solution phase chemistry supports a wider variety of organic
reactions than solid-phase chemistry.

[0164] Combinatorial compound libraries of the present invention may be synthesized
using the apparatus described in U.S. Patent No. 6,190,619 to Kilcoin et al., which is hereby
incorporated by reference in its entirety. U.S. Patent No. 6,190,619 discloses a synthesis
apparatus capable of holding a plurality of reaction vessels for parallel synthesis of multiple
discrete compounds or for combinatorial libraries of compounds.
[0165] In one embodiment, the combinatorial compound library can be synthesized in
solution. The method disclosed in U.S. Patent No. 6,194,612 to Boger et al., which is
hereby incorporated by reference in its entirety, features compounds useful as templates for
solution phase synthesis of combinatorial libraries. The template is designed to permit
reaction products to be easily purified from unreacted reactants using liquid/liquid or
solid/liquid extractions. The compounds produced by combinatorial synthesis using the
template will preferably be small organic molecules. Some compounds in the library may
mimic the effects of non-peptides or peptides. In contrast to solid phase synthesis of
combinatorial compound libraries, liquid phase synthesis does not require the use of
specialized protocols for monitoring the individual steps of a multistep solid phase synthesis
(Egner et al., 1995, J. Org. Chem. 60:2652; Anderson et al., 1995, J. Org. Chem. 60:2650;
Fitch et al., 1994, J. Org. Chem. 59:7955; Look et al., 1994, J. Org. Chem. 59:7588;
Metzger et al., 1993, Angew. Chem., Int. Ed. Engl. 32:894; Youngquist et al., 1994, Rapid
Commun. Mass Spect. 8:77; Chu et al., 1995, J. Am. Chem. Soc. 117:5419; Brummel et al.,
1994, Science 264:399; and Stevanovic et al., 1993, Bioorg. Med. Chem. Lett. 3:431).
[0166] Combinatorial compound libraries useful for the methods of the present
invention can be synthesized on solid supports. In one embodiment, a split synthesis
method, a protocol of separating and mixing solid supports during the synthesis, is used to
synthesize a library of compounds on solid supports (see, e.g., Lam et al., 1997, Chem. Rev.
97:411-448; Ohlmeyer et al., 1993, Proc. Natl. Acad. Sci. USA 90:10922-10926 and
references cited therein). Each solid support in the final library has substantially one type
of compound attached to its surface. Other methods for synthesizing combinatorial libraries
on solid supports, wherein one product is attached to each support, will be known to those
of skill in the art (see, e.g., Nefzi et al., 1997, Chem. Rev. 97:449-472).
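
The split synthesis protocol can be pictured with a small simulation. The sketch below is an idealized model, not the procedure of the cited references: beads are pooled and shuffled, split into one portion per building block, and each portion is coupled with a single block. The function name, bead count, and building-block labels are hypothetical.

```python
import random
from itertools import product


def split_and_mix(n_beads, building_blocks, n_cycles, seed=0):
    """Simulate split synthesis; each bead is a tuple of coupled blocks."""
    rng = random.Random(seed)
    beads = [() for _ in range(n_beads)]
    for _ in range(n_cycles):
        # Mix: pool all beads and randomize their order.
        rng.shuffle(beads)
        # Split: deal the pooled beads into one portion per building block.
        portions = [beads[i::len(building_blocks)]
                    for i in range(len(building_blocks))]
        # Couple: every bead in a portion receives that portion's block only.
        beads = [bead + (block,)
                 for block, portion in zip(building_blocks, portions)
                 for bead in portion]
    return beads


building_blocks = ("A", "B", "C")
beads = split_and_mix(n_beads=900, building_blocks=building_blocks, n_cycles=3)
possible = set(product(building_blocks, repeat=3))
distinct = set(beads)
print(len(distinct), "distinct sequences on", len(beads), "beads,",
      len(possible), "possible")
```

Because each bead sees exactly one building block per cycle, it carries a single sequence at the end, which is the "one type of compound per support" property the paragraph relies on.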
[0167] As used herein, the term "solid support" is not limited to a specific type of solid
support. Rather, a large number of supports are available and are known to one skilled in
the art. Solid supports include silica gels, resins, derivatized plastic films, glass beads,
cotton, plastic beads, polystyrene beads, alumina gels, and polysaccharides. A suitable
solid support may be selected on the basis of desired end use and suitability for various
synthetic protocols. For example, for peptide synthesis, a solid support can be a resin such
as p-methylbenzhydrylamine (pMBHA) resin (Peptides International, Louisville, KY),
polystyrenes (e.g., PAM-resin obtained from Bachem Inc., Peninsula Laboratories, etc.),
including chloromethylpolystyrene, hydroxymethylpolystyrene and
aminomethylpolystyrene, poly(dimethylacrylamide)-grafted styrene co-divinyl-benzene
(e.g., POLYHIPE resin, obtained from Aminotech, Canada), polyamide resin (obtained
from Peninsula Laboratories), polystyrene resin grafted with polyethylene glycol (e.g.,
TENTAGEL or ARGOGEL, Bayer, Tubingen, Germany), polydimethylacrylamide resin
(obtained from Milligen/Biosearch, California), or Sepharose (Pharmacia, Sweden).
[0168] In some embodiments of the present invention, compounds can be attached to
solid supports via linkers. Linkers can be integral and part of the solid support, or they may
be nonintegral linkers that are either synthesized on the solid support or attached thereto after
synthesis. Linkers are useful not only for providing points of compound attachment to the
solid support, but also for allowing different groups of molecules to be cleaved from the
solid support under different conditions, depending on the nature of the linker. For
example, linkers can be, inter alia, electrophilically cleaved, nucleophilically cleaved,
photocleavable, enzymatically cleaved, cleaved by metals, cleaved under reductive
conditions or cleaved under oxidative conditions. In a preferred embodiment, the
compounds are cleaved from the solid support prior to high throughput screening of the
compounds.

5.4. Reporter Gene-Based Screening Assays 

5.4.1. Cell-Based Assays 
[0169] After a vector containing the reporter gene construct is transformed or
transfected into a host cell and a compound library is synthesized or purchased or both, the
cells are used to screen the library to identify compounds that modulate untranslated region-
dependent expression of a target gene. In a preferred embodiment, the cells are stably
transfected with the reporter gene construct. The reporter gene-based assays may be
conducted by contacting a compound or a member of a library of compounds with a cell
genetically engineered to express a nucleic acid comprising a reporter gene operably linked
to one or more untranslated regions of a target gene, and measuring the expression of said
reporter gene. An alteration in reporter gene expression relative to a previously determined
reference range, or to expression in the absence of the compound or in the presence of a
control, in such reporter gene-based assays indicates that a particular compound modulates
untranslated region-dependent expression of a target gene. In a preferred embodiment, a
negative control (e.g., PBS or another agent that is known to have no effect on the
expression of the reporter gene) and a positive control (e.g., an agent that is known to have
an effect on the expression of the reporter gene, preferably an agent that affects untranslated
region-dependent expression) are included in the cell-based assays described herein.
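
The comparison against a negative control can be sketched as a fold-change calculation. This is a minimal illustration under stated assumptions: the two-fold cutoff, the function names, and the signal values are hypothetical and do not come from the text, which only requires that expression be "altered" relative to the control.

```python
from statistics import mean


def fold_change(compound_signal, negative_control_signals):
    """Reporter signal in a compound well divided by the mean control signal."""
    baseline = mean(negative_control_signals)
    if baseline <= 0:
        raise ValueError("negative-control baseline must be positive")
    return compound_signal / baseline


def is_modulator(compound_signal, negative_control_signals, min_fold=2.0):
    """Flag a compound whose signal rises or falls by at least min_fold.

    min_fold is an assumed cutoff chosen for illustration only.
    """
    fc = fold_change(compound_signal, negative_control_signals)
    return fc >= min_fold or fc <= 1.0 / min_fold


# Hypothetical reporter readings (arbitrary units) from PBS-treated wells.
pbs_wells = [1000.0, 1100.0, 900.0]

for signal in (2500.0, 1200.0, 400.0):
    print(signal, fold_change(signal, pbs_wells), is_modulator(signal, pbs_wells))
```

A positive-control well can be passed through the same functions as a check that the assay is able to detect a known modulator at all.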

[0170] The step of contacting a compound or a member of a library of compounds with
a cell genetically engineered to express a reporter gene operably linked to one or more
untranslated regions may be conducted under physiologic conditions. In a specific
embodiment, a compound or a member of a library of compounds is added to the cells in
the presence of an aqueous solution. In accordance with this embodiment, the aqueous
solution may comprise a buffer and a combination of salts, preferably approximating or
mimicking physiologic conditions. Alternatively, the aqueous solution may comprise a
buffer, a combination of salts, and a detergent or a surfactant. Examples of salts which may
be used in the aqueous solution include, but are not limited to, KCl, NaCl, and/or MgCl2. The
optimal concentration of each salt used in the aqueous solution is dependent on the cells and
compounds used and can be determined using routine experimentation.
[0171] The invention provides for contacting a compound or a member of a library of
compounds with a cell genetically engineered to express a reporter gene operably linked to
one or more untranslated regions for a specific period of time. For example, the contacting
can take place for about 1 minute, 2 minutes, 3 minutes, 4 minutes, 5 minutes, 10 minutes,
15 minutes, 20 minutes, 30 minutes, 1 hour, 2 hours, 3 hours, 4 hours, 5 hours, 10 hours, 15
hours, 20 hours, 1 day, 2 days, 3 days, 4 days, 5 days, or 1 week. In a preferred
embodiment, the contacting is about 15 hours, i.e., overnight. The contacting can take place
for about 1 minute to 1 week, preferably about 5 minutes to 5 days, more preferably about
10 minutes to 2 days, and even more preferably about 1 hour to 1 day.
[0172] In one embodiment, the invention provides a method for identifying a compound
that modulates untranslated region-dependent expression of a target gene, said method
comprising: (a) expressing a nucleic acid comprising a reporter gene operably linked to one
or more untranslated regions of said target gene in a cell; (b) contacting said cell with a
member of a library of compounds; and (c) detecting the expression of said reporter gene,
wherein a compound that modulates untranslated region-dependent regulation of expression
is identified if the expression of said reporter gene in the presence of a compound is altered
relative to a previously determined reference range or the expression of said reporter gene
in the absence of said compound or the presence of a control (e.g., phosphate buffered
saline ("PBS")). In another embodiment, the invention provides a method for identifying a
compound that modulates untranslated region-dependent expression of a target gene, said
method comprising: (a) contacting a member of a library of compounds with a cell
containing a nucleic acid comprising a reporter gene operably linked to one or more
untranslated regions of said target gene; and (b) detecting a reporter protein translated from
said reporter gene, wherein a
compound that modulates untranslated region-dependent expression is identified if the
expression of said reporter gene in the presence of a compound is altered relative to a
previously determined reference range or the expression of said reporter gene in the absence
of said compound or the presence of a control (e.g., PBS).

[0173] The invention also provides methods of identifying compounds that upregulate
or down-regulate untranslated region-dependent expression of a target gene utilizing the
cell-based reporter gene assays described herein. In a specific embodiment, the invention
provides a method of upregulating untranslated region-dependent expression of a target
gene, said method comprising (a) contacting a compound with a cell containing a nucleic
acid comprising a reporter gene operably linked to one, two, three or more untranslated
regions of said target gene; and (b) detecting a reporter gene protein translated from said
reporter gene, wherein a compound that upregulates untranslated region-dependent
expression is identified if the expression of said reporter gene in the presence of a
compound is increased relative to a previously determined reference range, or the expression
in the absence of the compound or the presence of a control (e.g., PBS). In another
embodiment, the invention provides a method of down-regulating untranslated region-
dependent expression of a target gene, said method comprising (a) contacting a compound
with a cell containing a nucleic acid comprising a reporter gene operably linked to one, two,
three or more untranslated regions of said target gene; and (b) detecting a reporter gene
protein translated from said reporter gene, wherein a compound that down-regulates
untranslated region-dependent expression is identified if the expression of said reporter gene
in the presence of a compound is decreased relative to a previously determined reference
range, or the expression in the absence of the compound or the presence of a control (e.g.,
PBS).
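
The upregulation and down-regulation criteria can be expressed as a comparison against a reference range. The sketch below is illustrative only; the range bounds, compound labels, and readings are hypothetical.

```python
def classify(signal, reference_range):
    """Classify a reporter reading against a (low, high) reference range."""
    low, high = reference_range
    if low > high:
        raise ValueError("reference range must be given as (low, high)")
    if signal > high:
        return "upregulates"
    if signal < low:
        return "down-regulates"
    return "no detectable effect"


# Hypothetical reference range and reporter readings (arbitrary units).
reference_range = (800.0, 1200.0)
readings = {"cmpd-1": 2400.0, "cmpd-2": 950.0, "cmpd-3": 310.0}

calls = {name: classify(value, reference_range)
         for name, value in readings.items()}
print(calls)
```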

[0174] The present invention provides methods of identifying environmental stimuli
(e.g., exposure to different concentrations of CO2 and/or O2, stress and different pHs) that
modulate untranslated region-dependent expression of a target gene utilizing the cell-based
reporter gene assays described herein. In particular, the invention provides a method of
identifying an environmental stimulus, said method comprising (a) contacting a cell
containing a nucleic acid comprising a reporter gene operably linked to one, two, three or
more untranslated regions of said target gene with an environmental stimulus; and (b)
detecting a reporter gene protein translated from said reporter gene, wherein an environmental
stimulus that modulates untranslated region-dependent expression is identified if the expression of
said reporter gene in the presence of the environmental stimulus is altered relative to a
previously determined reference range, or the expression in the absence of the environmental stimulus or
the presence of a control (e.g., PBS). In a specific embodiment, the environmental stimulus
is not hypoxia. In another embodiment, the environmental stimulus does not include a
compound.

[0175] The expression of a reporter gene in the cell-based reporter-gene assays may be
detected by any technique well-known to one of skill in the art. Methods for detecting the
expression of a reporter gene will vary with the reporter gene used. Assays for the various
reporter genes are well-known to one of skill in the art. For example, as described in
Section 5.2.1., luciferase, beta-galactosidase ("b-gal"), beta-glucuronidase ("GUS"), beta-
lactamase, chloramphenicol acetyltransferase ("CAT"), and alkaline phosphatase ("AP")
are enzymes that can be analyzed in the presence of a substrate and could be amenable to
high throughput screening. For example, the reaction products of luciferase, beta-
galactosidase ("b-gal"), and alkaline phosphatase ("AP") are assayed by changes in light
imaging (e.g., luciferase), spectrophotometric absorbance (e.g., b-gal), or fluorescence (e.g.,
AP). Assays for changes in light output, absorbance, and/or fluorescence are easily adapted
for high throughput screening. For example, b-gal activity can be measured with a
microplate reader. Green fluorescent protein ("GFP") activity can be measured by changes
in fluorescence. For example, in the case of mutant GFPs that fluoresce at 488 nm,
standard fluorescence activated cell sorting ("FACS") equipment can be used to separate
cells based upon GFP activity.

[0176] Alterations in the expression of a reporter gene may be determined by
comparing the level of expression of the reporter gene to a negative control (e.g., PBS or
another agent that is known to have no effect on the expression of the reporter gene) and,
optionally, a positive control (e.g., an agent that is known to have an effect on the
expression of the reporter gene, preferably an agent that affects untranslated region-
dependent expression). Alternatively, alterations in the expression of a reporter gene may
be determined by comparing the level of expression of the reporter gene to a previously
determined reference range.
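
One way a "previously determined reference range" might be obtained is from earlier negative-control measurements. The sketch below assumes a range of the mean plus or minus three sample standard deviations; that rule, the function name, and the readings are assumptions for illustration, since the text does not say how the range is derived.

```python
from statistics import mean, stdev


def reference_range(control_readings, n_sd=3.0):
    """Return (low, high) as mean +/- n_sd sample standard deviations.

    The n_sd = 3 convention is an assumption, not taken from the source.
    """
    if len(control_readings) < 2:
        raise ValueError("need at least two control readings")
    center = mean(control_readings)
    spread = n_sd * stdev(control_readings)
    return center - spread, center + spread


# Hypothetical historical readings from untreated reporter cells.
historical_controls = [100.0, 110.0, 90.0, 105.0, 95.0]
low, high = reference_range(historical_controls)
print(round(low, 2), round(high, 2))
```

A new reading that falls outside `(low, high)` would then count as an alteration in reporter gene expression.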

5.4.2. Cell-Free Assays 

[0177] After a vector containing the reporter gene construct is produced, a cell-free
translation extract is generated or purchased, and a compound library is synthesized or
purchased or both, the cell-free translation extract and nucleic acid are used to screen the
library to identify compounds that modulate untranslated region-dependent expression of a
target gene. The reporter gene-based assays may be conducted in a cell-free manner by
contacting a compound or a member of a library of compounds with a cell-free translation
mixture and a nucleic acid comprising a reporter gene operably linked to one or more
untranslated regions of a target gene, and measuring the expression of said reporter gene.
An alteration in reporter gene expression relative to a previously determined reference
range, or to expression in the absence of a compound or in the presence of a control, in
such reporter gene-based assays indicates that a particular compound modulates
untranslated region-dependent expression of a target gene. In a preferred embodiment, a
negative control (e.g., PBS or another agent that is known to have no effect on the
expression of the reporter gene) and a positive control (e.g., an agent that is known to have
an effect on the expression of the reporter gene, preferably an agent that affects untranslated
region-dependent expression) are included in the cell-free assays described herein.

[0178] The step of contacting a compound or a member of a library of compounds with
a cell-free translation mixture containing a nucleic acid comprising a reporter gene operably
linked to one or more untranslated regions may be conducted under conditions
approximating or mimicking physiologic conditions. In a specific embodiment, a compound
or a member of a library of compounds is added to the cell-free translation mixture in the
presence of an aqueous solution. In accordance with this embodiment, the aqueous solution
may comprise a buffer and a combination of salts, preferably approximating or mimicking
physiologic conditions. Alternatively, the aqueous solution may comprise a buffer, a
combination of salts, and a detergent or a surfactant. Examples of salts which may be used
in the aqueous solution include, but are not limited to, KCl, NaCl, and/or MgCl2. The
optimal concentration of each salt used in the aqueous solution is dependent on the cell-free
translation mixture and compounds used and can be determined using routine experimentation.

[0179] The invention provides for contacting a compound or a member of a library of
compounds with a cell-free translation mixture and a nucleic acid comprising a reporter
gene operably linked to one or more untranslated regions for a specific period of time. For
example, the contacting
can take place for about 1 minute, 2 minutes, 3 minutes, 4 minutes, 5 minutes, 10 minutes,
15 minutes, 20 minutes, 30 minutes, 1 hour, 2 hours, 3 hours, 4 hours, 5 hours, 10 hours, 15
hours, 20 hours, 1 day, 2 days, 3 days, 4 days, 5 days, or 1 week. In a preferred
embodiment, the contacting is about 15 hours, i.e., overnight. The contacting can take place
for about 1 minute to 1 week, preferably about 5 minutes to 5 days, more preferably about
10 minutes to 2 days, and even more preferably about 1 hour to 1 day.
[0180] In a specific embodiment, the invention provides a method for identifying a
compound that modulates untranslated region-dependent expression of a target gene, said
method comprising: (a) contacting a member of a library of compounds with a cell-free
translation mixture and a nucleic acid comprising a reporter gene operably linked to one or
more untranslated regions of said target gene; and (b) detecting the expression of said
reporter gene, wherein a compound that modulates untranslated region-dependent
expression is identified if the expression of said reporter gene in the presence of a
compound is altered relative to a previously determined reference range or the expression of
said reporter gene in the absence of said compound or the presence of a control (e.g., PBS).
[0181] The invention also provides methods of identifying compounds that upregulate
or down-regulate untranslated region-dependent expression of a target gene utilizing the
cell-free reporter gene assays described herein. In a specific embodiment, the invention
provides a method of upregulating untranslated region-dependent expression of a target
gene, said method comprising (a) contacting a compound with a cell-free translation
mixture and a nucleic acid comprising a reporter gene operably linked to one, two, three or
more untranslated regions of said target gene; and (b) detecting a reporter gene protein
translated from said reporter gene, wherein a compound that upregulates untranslated region-
dependent expression is identified if the expression of said reporter gene in the presence of
a compound is increased relative to a previously determined reference range, or the expression
in the absence of the compound or the presence of a control (e.g., PBS). In another
embodiment, the invention provides a method of down-regulating untranslated region-
dependent expression of a target gene, said method comprising (a) contacting a compound
with a cell-free translation mixture and a nucleic acid comprising a reporter gene operably
linked to one, two, three or more untranslated regions of said target gene; and (b) detecting
a reporter gene protein translated from said reporter gene, wherein a compound that down-
regulates untranslated region-dependent expression is identified if the expression of said
reporter gene in the presence of a compound is decreased relative to a previously
determined reference range, or the expression in the absence of the compound or the
presence of a control (e.g., PBS).

[0182] The activity of a compound in the in vitro translation mixture can be determined
by assaying the activity of a reporter protein encoded by a reporter gene, or alternatively, by
quantifying the expression of the reporter gene by, for example, labeling the in vitro
translated protein (e.g., with 35S-labeled methionine), northern blot analysis, RT-PCR or by
immunological methods, such as western blot analysis or immunoprecipitation. Such
methods are well-known to one of skill in the art.

5.4.3. Direct Binding Assays 

[0183] Compounds that modulate untranslated region-dependent expression of a target
gene can be identified by direct binding assays. In this embodiment, the target RNA
comprises one or more untranslated regions, and preferably contains at least one element of
an untranslated region. Such assays are described in International Patent Publication Nos.
WO 02/083837 and WO 02/083953, the disclosures of which are hereby incorporated by
reference in their entireties. Briefly, direct binding assays may be conducted by attaching a
library of compounds to solid supports, e.g., polymer beads, with each solid support having
substantially one type of compound attached to its surface. The plurality of solid supports
of the library is exposed in aqueous solution to target RNA having a detectable label,
forming a dye-labeled target RNA:support-attached compound complex. Binding of a
target RNA molecule to a particular compound labels the solid support, e.g., bead,
comprising the compound, which can be physically separated from other, unlabeled solid
supports. Once labeled solid supports are identified, the chemical structures of the
compounds thereon can be determined by, e.g., reading a code on the solid support that
correlates with the structure of the attached compound.

[0184] Alternatively, direct binding assays may be conducted by contacting a target
RNA having a detectable label with a member of a library of compounds free in solution, in
labeled tubes or microtiter wells, or a microarray. Compounds in the library that bind to the
labeled target RNA will form a detectably labeled complex that can be identified and
removed from the uncomplexed, unlabeled compounds in the library, and from
uncomplexed, labeled target RNA, by a variety of methods including, but not limited to,
methods that differentiate changes in the electrophoretic, chromatographic, or thermostable
properties of the complexed target RNA.

5.4.3.1. Electrophoresis 
[0185] Methods for separation of the complex of a target RNA bound to a compound
from the unbound RNA comprise any method of electrophoretic separation, including but
not limited to, denaturing and non-denaturing polyacrylamide gel electrophoresis, urea gel
electrophoresis, gel filtration, pulsed field gel electrophoresis, two dimensional gel
electrophoresis, continuous flow electrophoresis, zone electrophoresis, agarose gel
electrophoresis, and capillary electrophoresis.

[0186] In a preferred embodiment, an automated electrophoretic system comprising a
capillary cartridge having a plurality of capillary tubes is used for high-throughput
screening of compounds bound to target RNA. Such an apparatus for performing
automated capillary gel electrophoresis is disclosed in U.S. Patent Nos. 5,885,430;
5,916,428; 6,027,627; and 6,063,251, the disclosures of which are incorporated by reference
in their entireties.

[0187] The device disclosed in U.S. Patent No. 5,885,430, which is incorporated by
reference in its entirety, allows one to simultaneously introduce samples into a plurality of
capillary tubes directly from microtiter trays having a standard size. U.S. Patent No.
5,885,430 discloses a disposable capillary cartridge which can be cleaned between
electrophoresis runs, the cartridge having a plurality of capillary tubes. A first end of each
capillary tube is retained in a mounting plate, the first ends collectively forming an array in
the mounting plate. The spacing between the first ends corresponds to the spacing between
the centers of the wells of a microtiter tray having a standard size. Thus, the first ends of
the capillary tubes can simultaneously be dipped into the samples present in the tray's
wells. The cartridge is provided with a second mounting plate in which the second ends of
the capillary tubes are retained. The second ends of the capillary tubes are arranged in an
array which corresponds to the wells in the microtiter tray, which allows for each capillary
tube to be isolated from its neighbors and therefore free from cross-contamination, as each
end is dipped into an individual well.
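
The correspondence between capillary ends and well centers amounts to a regular grid. The sketch below computes well-center positions for an 8 x 12 tray with a 9 mm center-to-center pitch, which is the usual 96-well layout; the text says only "a standard size", so the tray format and pitch are assumptions, and the function name is hypothetical.

```python
def well_centers(rows=8, cols=12, pitch_mm=9.0):
    """Map well labels (A1..H12) to (x, y) centers in mm, relative to well A1."""
    centers = {}
    for r in range(rows):
        for c in range(cols):
            label = f"{chr(ord('A') + r)}{c + 1}"
            centers[label] = (c * pitch_mm, r * pitch_mm)
    return centers


# Capillary first ends mounted at these positions would each meet one well.
centers = well_centers()
print(len(centers), centers["A1"], centers["H12"])
```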

[0188] Plate holes may be provided in each mounting plate and the capillary tubes
inserted through these plate holes. In such a case, the plate holes are sealed airtight so that
the side of the mounting plate having the exposed capillary ends can be pressurized.
Application of a positive pressure in the vicinity of the capillary openings in this mounting
plate allows for the introduction of air and fluids during electrophoretic operations and also
can be used to force out gel and other materials from the capillary tubes during
reconditioning. The capillary tubes may be protected from damage using a needle
comprising a cannula and/or plastic tubes, and the like when they are placed in these plate
holes. When metallic cannula or the like are used, they can serve as electrical contacts for
current flow during electrophoresis. In the presence of a second mounting plate, the second
mounting plate is provided with plate holes through which the second ends of the capillary
tubes project. In this instance, the second mounting plate serves as a pressure containment
member of a pressure cell and the second ends of the capillary tubes communicate with an
internal cavity of the pressure cell. The pressure cell is also formed with an inlet and an
outlet. Gels, buffer solutions, cleaning agents, and the like may be introduced into the
internal cavity through the inlet, and each of these can simultaneously enter the second ends
of the capillaries.

[0189] In another preferred embodiment, the automated electrophoretic system can
comprise a chip system consisting of complex designs of interconnected channels that
perform and analyze enzyme reactions using part of a channel design as a tiny, continuously
operating electrophoresis material, where reactions with one sample are going on in one
area of the chip while electrophoretic separation of the products of another sample is taking
place in a different part of the chip. Such a system is disclosed in U.S. Patent Nos.
5,699,157; 5,842,787; 5,869,004; 5,876,675; 5,942,443; 5,948,227; 6,042,709; 6,042,710;
6,046,056; 6,048,498; 6,086,740; 6,132,685; 6,150,119; 6,150,180; 6,153,073; 6,167,910;
6,171,850; and 6,186,660, the disclosures of which are incorporated by reference in their
entireties.

[0190] The system disclosed in U.S. Patent No. 5,699,157, which is hereby
incorporated by reference in its entirety, provides for a microfluidic system for high-speed
electrophoretic analysis of subject materials for applications in the fields of chemistry,
biochemistry, biotechnology, molecular biology and numerous other areas. The system has
a channel in a substrate, a light source and a photoreceptor. The channel holds subject
materials in solution in an electric field so that the materials move through the channel and
separate into bands according to species. The light source excites fluorescent light in the
species bands and the photoreceptor is arranged to receive the fluorescent light from the
bands. The system further has a means for masking the channel so that the photoreceptor
can receive the fluorescent light only at periodically spaced regions along the channel. The
system also has a unit connected to analyze the modulation frequencies of light intensity
received by the photoreceptor so that velocities of the bands along the channel are
determined, which allows the materials to be analyzed.

[0191] The system disclosed in U.S. Patent No. 5,699,157 also provides for a method of 
performing high-speed electrophoretic analysis of subject materials, which comprises the 
steps of holding the subject materials in solution in a channel of a microfluidic system; 
subjecting the materials to an electric field so that the subject materials move through the 
channel and separate into species bands; directing light toward the channel; receiving light 
from periodically spaced regions along the channel simultaneously; and analyzing the 
frequencies of light intensity of the received light so that velocities of the bands along the 
channel can be determined for analysis of said materials. The determination of the velocity 
of a species band determines the electrophoretic mobility of the species and its 
identification. 
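
By way of illustration only, the relationship between modulation frequency, band velocity, and electrophoretic mobility described above can be sketched as follows. The sketch assumes that a band crossing detection regions spaced a fixed distance apart produces an intensity signal whose frequency equals the band velocity divided by that spacing. The function names, the 100 micrometer spacing, and the field strength are hypothetical and are not taken from U.S. Patent No. 5,699,157.

```python
def band_velocity(modulation_hz: float, mask_period_m: float) -> float:
    """Velocity (m/s) of a band that crosses detection regions spaced
    mask_period_m apart and modulates the detected light at modulation_hz."""
    return modulation_hz * mask_period_m


def electrophoretic_mobility(velocity_m_s: float, field_v_per_m: float) -> float:
    """Mobility (m^2 / (V s)) as velocity divided by electric field strength."""
    if field_v_per_m == 0:
        raise ValueError("electric field strength must be non-zero")
    return velocity_m_s / field_v_per_m


# Hypothetical example values (assumptions, for illustration only).
velocity = band_velocity(modulation_hz=5.0, mask_period_m=100e-6)
mobility = electrophoretic_mobility(velocity, field_v_per_m=2.0e4)
print(f"velocity = {velocity:.3e} m/s, mobility = {mobility:.3e} m^2/(V s)")
```

Two species with different mobilities would modulate the detected light at different frequencies under the same field, which is what allows the bands to be distinguished.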

[0192] U.S. Patent No. 5,842,787, which is hereby incorporated by reference in its 
entirety, is generally directed to devices and systems that employ channels having, at least in 
part, depths that are varied over those which have been previously described (such as the 
device disclosed in U.S. Patent No. 5,699,157), wherein said channel depths provide 
numerous beneficial and unexpected results such as but not limited to, a reduction in sample 
perturbation, reduced non-specific sample mixture by diffusion, and increased resolution. 
[0193] In another embodiment, the electrophoretic method of separation comprises 
polyacrylamide gel electrophoresis. In a preferred embodiment, the polyacrylamide gel 
electrophoresis is non-denaturing, so as to differentiate the mobilities of the target RNA 
bound to a compound from free target RNA. If the polyacrylamide gel electrophoresis is 
denaturing, then the target RNA:compound complex must be cross-linked prior to 
electrophoresis to prevent the disassociation of the target RNA from the compound during 
electrophoresis. Such techniques are well known to one of skill in the art. 
[0194] In one embodiment of the method, the binding of compounds to target nucleic 
acid can be detected, preferably in an automated fashion, by gel electrophoretic analysis of 
interference footprinting. RNA can be degraded at specific base sites by enzymatic 
methods such as ribonucleases A, U2, CL3, T1, Phy M, and B. cereus or chemical methods 
such as diethylpyrocarbonate, sodium hydroxide, hydrazine, piperidine formate, dimethyl 
sulfate, [2,12-dimethyl-3,7,11,17-tetraazacyclo[11.3.1]heptadeca-1(17),2,11,13,15- 
pentaenato] nickel(II) (NiCR), cobalt(II) chloride, or iron(II) ethylenediaminetetraacetate 
(Fe-EDTA) as described for example in Zheng et al., 1999, Biochem. 37:2207-2214; 
Latham & Cecil, 1989, Science 245:276-282; and Sambrook et al., 2001, in Molecular 
Cloning: A Laboratory Manual, pp 12.61-12.73, Cold Spring Harbor Laboratory Press, and 
the references cited therein, which are hereby incorporated by reference in their entireties. 
The specific pattern of cleavage sites is determined by the accessibility of particular bases 
to the reagent employed to initiate cleavage and, as such, is determined by the 
three-dimensional structure of the RNA. 

[0195] The interaction of small molecules with a target nucleic acid can change the 
accessibility of bases to these cleavage reagents either by causing conformational changes in 
the target nucleic acid or by covering a base at the binding interface. When a compound 
binds to the nucleic acid and changes the accessibility of bases to cleavage reagents, the 
observed cleavage pattern will change. This method can be used to identify and 
characterize the binding of small molecules to RNA as described, for example, by Prudent 
et al., 1995, J. Am. Chem. Soc. 117:10145-10146 and Mei et al., 1998, Biochem. 
37:14204-14212. 

[0196] In the preferred embodiment of this technique, the detectably labeled target 
nucleic acid is incubated with an individual compound and then subjected to treatment with 
a cleavage reagent, either enzymatic or chemical. The reaction mixture can preferably 
be examined directly, or treated further to isolate and concentrate the nucleic acid. The 
fragments produced are separated by electrophoresis and the pattern of cleavage can be 
compared to a cleavage reaction performed in the absence of compound. A change in the 
cleavage pattern directly indicates that the compound binds to the target nucleic acid. 
Multiple compounds can be examined both in parallel and serially. 
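
As a non-limiting sketch of the comparison described above, the following example flags base positions whose cleavage signal differs between a lane run without compound and a lane run with compound. It assumes that band intensities have already been quantified per base position, that each lane is normalized to its own total signal to correct for loading, and that a two-fold change is treated as meaningful. The function name, the normalization scheme, the threshold, and the intensity values are all hypothetical.

```python
def changed_positions(control, treated, min_fold=2.0):
    """Return base positions whose normalized cleavage intensity differs
    between the two lanes by at least min_fold (in either direction)."""

    def normalize(lane):
        total = sum(lane.values())
        if total <= 0:
            raise ValueError("lane has no signal")
        return {pos: value / total for pos, value in lane.items()}

    c = normalize(control)
    t = normalize(treated)
    hits = []
    for pos in sorted(set(c) | set(t)):
        low, high = sorted((c.get(pos, 0.0), t.get(pos, 0.0)))
        if high == 0.0:
            continue  # no cleavage at this position in either lane
        if low == 0.0 or high / low >= min_fold:
            hits.append(pos)
    return hits


# Hypothetical band intensities keyed by base position.
no_compound = {10: 100.0, 11: 100.0, 12: 100.0, 13: 100.0}
with_compound = {10: 100.0, 11: 20.0, 12: 100.0, 13: 100.0}
print(changed_positions(no_compound, with_compound))
```

In this hypothetical data set only position 11 is protected from cleavage in the presence of the compound, so it is the only position reported.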
[0197] Other embodiments of electrophoretic separation include, but are not limited to, 
urea gel electrophoresis, gel filtration, pulsed field gel electrophoresis, two dimensional gel 
electrophoresis, continuous flow electrophoresis, zone electrophoresis, and agarose gel 
electrophoresis. 



5.4.3.2. Size Exclusion Chromatography 

[0198] In another embodiment of the present invention, size-exclusion chromatography 
is used to purify compounds that are bound to a target nucleic acid from a complex mixture 
of compounds. Size-exclusion chromatography separates molecules based on their size and 
uses gel-based media comprised of beads with specific size distributions. When applied to 
a column, this media settles into a tightly packed matrix and forms a complex array of 
pores. Separation is accomplished by the inclusion or exclusion of molecules by these 
pores based on molecular size. Small molecules are included into the pores and, 
consequently, their migration through the matrix is retarded due to the added distance they 
must travel before elution. Large molecules are excluded from the pores and migrate with 
the void volume when applied to the matrix. In the present invention, a target nucleic acid 
is incubated with a mixture of compounds while free in solution and allowed to reach 
equilibrium. When applied to a size exclusion column, compounds free in solution are 
retained by the column, and compounds bound to the target nucleic acid are passed through 
the column. In a preferred embodiment, spin columns commonly used for gel filtration of 
nucleic acids will be employed to separate bound from unbound compounds (e.g., Bio-Spin 
columns manufactured by BIO-RAD). In another embodiment, the size exclusion matrix is 
packed into multiwell plates to allow high throughput separation of mixtures (e.g., 
PLASMID 96-well SEC plates manufactured by Millipore). 

5.4.3.3. Affinity Chromatography 

[0199] In one embodiment of the present invention, affinity capture is used to purify 
compounds that are bound to a target nucleic acid labeled with an affinity tag from a 
complex mixture of compounds. To accomplish this, a target nucleic acid labeled with an 
affinity tag is incubated with a mixture of compounds while free in solution and then 
captured to a solid support once equilibrium has been established; alternatively, target 
nucleic acids labeled with an affinity tag can be captured to a solid support first and then 
allowed to reach equilibrium with a mixture of compounds. 

[0200] The solid support is typically comprised of, but not limited to, cross-linked 
agarose beads that are coupled with a ligand for the affinity tag. Alternatively, the solid 
support may be a glass, silicon, metal, carbon, or plastic (e.g., polystyrene, polypropylene) 
surface with or without a self-assembled monolayer (SAM) either with a covalently 
attached ligand for the affinity tag, or with inherent affinity for the tag on the target nucleic 
acid. 

[0201] Once the complex between the target nucleic acid and compound has reached 
equilibrium and has been captured, one skilled in the art will appreciate that the retention of 
bound compounds and removal of unbound compounds is facilitated by washing the solid 
support with large excesses of binding reaction buffer. Furthermore, retention of high 
affinity compounds and removal of low affinity compounds can be accomplished by a 
number of means that increase the stringency of washing; these means include, but are not 
limited to, increasing the number and duration of washes, raising the salt concentration of 
the wash buffer, addition of detergent or surfactant to the wash buffer, and addition of 
non-specific competitor to the wash buffer. 

[0202] In one embodiment, the compounds themselves are detectably labeled with 
fluorescent dyes, radioactive isotopes, or nanoparticles. When the compounds are applied 
to the captured target nucleic acid in a spatially addressed fashion (e.g., in separate wells of 
a 96-well microplate), binding between the compounds and the target nucleic acid can be 
determined by the presence of the detectable label on the compound using fluorescence, 
[0203] Following the removal of unbound compounds, bound compounds with high 
affinity for the target nucleic acid can be eluted from the immobilized target nucleic acids 
and analyzed. The elution of compounds can be accomplished by any means that break the 
non-covalent interactions between the target nucleic acid and compound. Means for elution 
include, but are not limited to, changing the pH, changing the salt concentration, the 
application of organic solvents, and the application of molecules that compete with the 
bound ligand. In a preferred embodiment, the means employed for elution will release the 
compound from the target RNA, but will not affect the interaction between the affinity tag 
and the solid support, thereby achieving selective elution of compound. Moreover, a 
preferred embodiment will employ an elution buffer that is volatile to allow for subsequent 
concentration by lyophilization of the eluted compound (e.g., 0 M to 5 M ammonium 
acetate). 

5.5. Methods for Confirming that a Compound 
Modulates Untranslated Region-Dependent Expression 

[0204] In order to exclude the possibility that a particular compound is functioning 
solely by modulating the expression of a target gene in an untranslated region-independent 
manner, one or more mutations may be introduced into the untranslated regions operably 
linked to a reporter gene and the effect on the expression of the reporter gene in a reporter 
gene-based assay described herein can be determined. For example, a reporter gene 
construct comprising the 5' UTR of a target gene may be mutated by deleting a fragment of 
the 5' UTR of the target gene or substituting a fragment of the 5' UTR of the target gene 
with a fragment of the 5' UTR of another gene and measuring the expression of the reporter 
gene in the presence and absence of a compound that has been identified in a screening 
assay described supra (see Section 5.4). If the deletion of a fragment of the 5' UTR of the 
target gene or the substitution of a fragment of the 5' UTR of the target gene with a 
fragment of the 5' UTR of another gene affects the ability of the compound to modulate the 
expression of the reporter gene, then the fragment of the 5' UTR deleted or substituted 
plays a role in the regulation of the reporter gene expression and the regulation occurs, at least in 
part, in an untranslated region-dependent manner. 

[0205] The possibility that a particular compound is functioning solely by modulating 
the expression of a target gene in an untranslated region-independent manner may also be 
determined by changing the vector utilized as a reporter construct. The untranslated regions 
flanking a reporter gene from the first reporter construct in which an effect on reporter 
gene expression was detected following exposure to a compound may be inserted into a 
new reporter construct that has, e.g., different transcriptional regulation elements (e.g., a 
different promoter) and a different selectable marker. The level of reporter gene expression 
in the presence of the compound can be compared to the level of reporter gene expression in 
the absence of the compound or in the presence of a control (e.g., PBS). If there is no 
change in the level of expression of the reporter gene in the presence of the compound 
relative to the absence of the compound or in the presence of a control, then the compound 
probably is functioning in an untranslated region-independent manner. 
[0206] The specificity of a particular compound's effect on untranslated region- 
dependent expression of a target gene can also be determined. In particular, the effect of a 
particular compound on the expression of one or more genes (preferably, a plurality of 
genes) can be determined utilizing assays well-known to one of skill in the art or described 
herein. In a specific embodiment, the specificity of a particular compound for an 
untranslated region of a target gene is determined by (a) contacting the compound of 
interest with a cell containing a nucleic acid comprising a reporter gene operably linked to 
a UTR of a different gene (i.e., a gene different from the target gene which has a UTR 
different from the target gene); and (b) detecting a reporter gene protein translated from the 
reporter gene, wherein the compound is specific for the untranslated region of the target 
gene if the expression of said reporter gene in the presence of the compound is not altered 
or is not substantially altered relative to a previously determined reference range, or the 
expression in the absence of the compound or the presence of a control (e.g., PBS). In 
another embodiment, the specificity of a particular compound for an untranslated region of 
a target gene is determined by (a) contacting the compound of interest with a panel of cells, 
each cell in a different well of a container (e.g., a 48 or 96 well microtiter plate) and each 
cell containing a nucleic acid comprising a reporter gene operably linked to a UTR of a 
different gene; and (b) detecting a reporter gene protein translated from the reporter gene, 
wherein the compound is specific for the untranslated region of the target gene if the 
expression of said reporter gene in the presence of the compound is not altered or is not 
substantially altered relative to a previously determined reference range, or the expression 
in the absence of the compound or the presence of a control (e.g., PBS). In accordance with 
this embodiment, the panel may comprise 5, 7, 10, 15, 20, 25, 50, 75, 100 or more cells. In 
another embodiment, the specificity of a particular compound for an untranslated region of 
a target gene is determined by (a) contacting the compound of interest with a cell-free 
translation mixture and a nucleic acid comprising a reporter gene operably linked to a 
UTR of a different gene; and (b) detecting a reporter gene protein translated from the 
reporter gene, wherein the compound is specific for the untranslated region of the target 
gene if the expression of said reporter gene in the presence of the compound is not altered 
or is not substantially altered relative to a previously determined reference range, or the 
expression in the absence of the compound or the presence of a control (e.g., PBS). As 
used herein, the term "not substantially altered" means that the compound alters the 
expression of the reporter gene or target gene by less than 20%, less than 15%, less than 
10%, less than 5%, or less than 2% relative to a negative control such as PBS. 
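
For illustration only, the "not substantially altered" criterion defined above can be expressed as a percent-change test against a negative control. The sketch below assumes that reporter expression is available as a positive numeric signal (e.g., luminescence counts) for both the compound-treated sample and the control. The function names, the default 20% threshold (the largest of the thresholds recited above), and the panel readings are hypothetical.

```python
def percent_change(treated: float, control: float) -> float:
    """Absolute change in reporter signal relative to the negative control, in percent."""
    if control <= 0:
        raise ValueError("control signal must be positive")
    return abs(treated - control) / control * 100.0


def not_substantially_altered(treated: float, control: float,
                              threshold_pct: float = 20.0) -> bool:
    """True if the compound alters expression by less than threshold_pct."""
    return percent_change(treated, control) < threshold_pct


def specific_across_panel(panel: dict, threshold_pct: float = 20.0) -> bool:
    """True if no reporter in a panel of non-target UTR constructs is substantially altered.

    panel maps a construct name to a (treated, control) pair of readings.
    """
    return all(not_substantially_altered(t, c, threshold_pct)
               for t, c in panel.values())


# Hypothetical readings for reporters linked to UTRs of genes other than the target.
panel = {
    "utr_gene_b": (950.0, 1000.0),   # 5% change
    "utr_gene_c": (1080.0, 1000.0),  # 8% change
}
print(specific_across_panel(panel))
```

A stricter threshold (e.g., 10%, 5%, or 2%) can be passed as `threshold_pct` to match the narrower ranges recited above.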
[0207] The compounds identified in the assays described supra that modulate 
untranslated region-dependent expression of a target gene (for convenience referred to 
herein as a "lead" compound) can be further tested for untranslated region-dependent 
binding to the target RNA (which contains at least one untranslated region, and preferably 
at least one element of an untranslated region). Furthermore, by assessing the effect of a 
compound on target gene expression, cis-acting elements, i.e., specific nucleotide 
sequences, that are involved in untranslated region-dependent expression may be identified. 

5.5.1. RNA Binding Assays 
[0208] The compounds that modulate untranslated region-dependent expression of a 
target gene can be tested for binding to the target RNA (which contains at least one 
untranslated region, and preferably at least one element of an untranslated region) by any 
method known in the art. See Section 5.4.3 supra. 

5.5.2. Subtraction Assay 
[0209] The element(s) of an untranslated region(s) that is necessary for a compound 
identified in accordance with the methods of the invention to modulate untranslated region- 
dependent expression of a target gene can be determined utilizing standard mutagenesis 
techniques well-known to one of skill in the art. One or more mutations (e.g., deletions, 
additions and/or substitutions) may be introduced into the untranslated regions operably 
linked to a reporter gene and the effect on the expression of the reporter gene in a reporter 
gene-based assay described herein can be determined. For example, a reporter gene 
construct comprising the 5' UTR of a target gene may be mutated by deleting a fragment or 
all of the 5' UTR of the target gene or substituting a fragment of the 5' UTR of the target 
gene with a fragment of the 5' UTR of another gene and measuring the expression of the 
reporter gene in the presence and absence of a compound that has been identified in a 
screening assay described supra (see Section 5.4). If the deletion of a fragment of the 5' 
UTR of the target gene or the substitution of a fragment of the 5' UTR of the target gene 
with a fragment of the 5' UTR of another gene affects the ability of the compound to 
modulate the expression of the reporter gene, then the fragment of the 5' UTR deleted or 
substituted plays a role in the regulation of the reporter gene expression. 
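
The comparison underlying the subtraction assay can be sketched, purely as an example, as follows. The sketch assumes that reporter expression has been measured with and without compound for the unmodified construct and for each deletion or substitution construct, and that a fragment is flagged when the compound's fold-effect on the mutant differs from its fold-effect on the unmodified construct by at least a chosen fraction. The function name, the 25% cutoff, the construct names, and the readings are hypothetical.

```python
def fragments_affecting_modulation(wild_type, mutants, min_relative_shift=0.25):
    """Return names of mutant constructs on which the compound's fold-effect
    differs from the wild-type fold-effect by at least min_relative_shift.

    wild_type is a (with_compound, without_compound) pair of reporter readings;
    mutants maps a construct name to such a pair.
    """
    wt_ratio = wild_type[0] / wild_type[1]
    flagged = []
    for name, (with_compound, without_compound) in mutants.items():
        ratio = with_compound / without_compound
        if abs(ratio - wt_ratio) / wt_ratio >= min_relative_shift:
            flagged.append(name)
    return flagged


# Hypothetical readings: the compound reduces wild-type reporter signal five-fold.
wild_type = (200.0, 1000.0)
mutants = {
    "del_1_50": (210.0, 1000.0),    # compound still inhibits: fragment not required
    "del_51_100": (900.0, 1000.0),  # inhibition largely lost: fragment implicated
}
print(fragments_affecting_modulation(wild_type, mutants))
```

With these hypothetical readings only the construct lacking residues 51 to 100 is reported, suggesting that this fragment contributes to the compound's effect.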
[0210] Standard techniques known to those of skill in the art can be used to introduce 
mutations in the nucleotide sequence of an untranslated region of a target gene, including, 
for example, site-directed mutagenesis and PCR-mediated mutagenesis. In a specific 
embodiment, less than 75 nucleic acid residue substitutions, less than 50 nucleic acid 
residue substitutions, less than 45 nucleic acid residue substitutions, less than 40 nucleic 
acid residue substitutions, less than 35 nucleic acid residue substitutions, less than 30 
nucleic acid residue substitutions, less than 25 nucleic acid residue substitutions, less than 
20 nucleic acid residue substitutions, less than 15 nucleic acid residue substitutions, less 
than 10 nucleic acid residue substitutions, or less than 5 nucleic acid residue substitutions 
are introduced into the nucleotide sequence of an untranslated region of a target gene. In 
another embodiment, less than 10 elements of an untranslated region of a target gene, less 
than 9 elements of an untranslated region of a target gene, less than 8 elements of an untranslated 
region of a target gene, less than 7 elements of an untranslated region of a target gene, less 
than 6 elements of an untranslated region of a target gene, less than 5 elements of an 
untranslated region of a target gene, less than 4 elements of an untranslated region of a 
target gene, less than 3 elements of an untranslated region of a target gene, or less than 2 
elements of an untranslated region of a target gene are mutated at one time. 

5.5.3. Expressed Protein Concentration and Activity Assays 
[0211] The compounds identified in the reporter gene-based assays described herein 
that modulate untranslated region-dependent expression may be tested in in vitro assays 
(e.g., cell-free assays) or in vivo assays (e.g., cell-based assays) well-known to one of skill 
in the art or described herein for the effect of said compounds on the expression of the 
target gene from which the untranslated regions of the reporter gene construct were derived. 
The specificity of a particular compound's effect on untranslated region-dependent 
expression of one or more other genes (preferably, a plurality of genes) can also be 
determined utilizing assays well-known to one of skill in the art or described herein. In a 
preferred embodiment, a compound identified utilizing the reporter gene-based assays 
described herein has a specific effect on the expression of only one gene or a group of genes 
within the same signaling pathway. 

[0212] The expression of a gene can be readily detected, e.g., by quantifying the protein 
and/or RNA encoded by said gene. Many methods standard in the art can be thus 
employed, including, but not limited to, immunoassays to detect and/or visualize gene 
expression (e.g., western blot, immunoprecipitation followed by sodium dodecyl sulfate 
polyacrylamide gel electrophoresis (SDS-PAGE), immunocytochemistry, etc.) and/or 
hybridization assays to detect gene expression by detecting and/or visualizing respectively 
mRNA encoding a gene (e.g., northern assays, dot blots, in situ hybridization, etc.). Such 
assays are routine and well known in the art (see, e.g., Ausubel et al., eds., 1994, Current 
Protocols in Molecular Biology, Vol. 1, John Wiley & Sons, Inc., New York, which is 
incorporated by reference herein in its entirety). Exemplary immunoassays are described 
briefly below (but are not intended by way of limitation). 

[0213] Immunoprecipitation protocols generally comprise lysing a population of cells in 
a lysis buffer such as RIPA buffer (1% NP-40 or Triton X-100, 1% sodium deoxycholate, 
0.1% SDS, 0.15 M NaCl, 0.01 M sodium phosphate at pH 7.2, 1% Trasylol) supplemented 
with protein phosphatase and/or protease inhibitors (e.g., EDTA, PMSF, aprotinin, sodium 
vanadate), adding the antibody of interest to the cell lysate, incubating for a period of time 
(e.g., 1 to 4 hours) at 4° C, adding protein A and/or protein G sepharose beads to the cell 
lysate, incubating for about an hour or more at 4° C, washing the beads in lysis buffer and 
resuspending the beads in SDS/sample buffer. The ability of the antibody of interest to 
immunoprecipitate a particular antigen can be assessed by, e.g., western blot analysis. One 
of skill in the art would be knowledgeable as to the parameters that can be modified to 
increase the binding of the antibody to an antigen and decrease the background (e.g., pre- 
clearing the cell lysate with sepharose beads). For further discussion regarding 



WO 2004/065561 PCT/US2004/001643 

immunoprecipitation protocols see, e.g., Ausubel et al., eds., 1994, Current Protocols in 
Molecular Biology, Vol. 1, John Wiley & Sons, Inc., New York at 10.16.1. 
[0214] Western blot analysis generally comprises preparing protein samples, 
electrophoresis of the protein samples in a polyacrylamide gel (e.g., 8%-20% SDS-PAGE 
depending on the molecular weight of the antigen), transferring the protein sample from the 
polyacrylamide gel to a membrane such as nitrocellulose, PVDF or nylon, blocking the 
membrane in blocking solution (e.g., PBS with 3% BSA or non-fat milk), washing the 
membrane in washing buffer (e.g., PBS-Tween 20), incubating the membrane with primary 
antibody (the antibody of interest) diluted in blocking buffer, washing the membrane in 
washing buffer, incubating the membrane with a secondary antibody (which recognizes the 
primary antibody, e.g., an anti-human antibody) conjugated to an enzymatic substrate (e.g., 
horseradish peroxidase or alkaline phosphatase) or radioactive molecule (e.g., ³²P or ¹²⁵I) 
diluted in blocking buffer, washing the membrane in wash buffer, and detecting the 
presence of the antigen. One of skill in the art would be knowledgeable as to the 
parameters that can be modified to increase the signal detected and to reduce the 
background noise. For further discussion regarding western blot protocols see, e.g., 
Ausubel et al., eds., 1994, Current Protocols in Molecular Biology, Vol. 1, John Wiley & 
Sons, Inc., New York at 10.8.1. 

[0215] ELISAs comprise preparing antigen, coating the well of a 96 well microtiter 
plate with the antigen, adding the antibody of interest conjugated to a detectable compound 
such as an enzymatic substrate (e.g., horseradish peroxidase or alkaline phosphatase) to the 
well and incubating for a period of time, and detecting the presence of the antigen. In 
ELISAs the antibody of interest does not have to be conjugated to a detectable compound; 
instead, a second antibody (which recognizes the antibody of interest) conjugated to a 
detectable compound may be added to the well. Further, instead of coating the well with 
the antigen, the antibody may be coated to the well. In this case, a second antibody 
conjugated to a detectable compound may be added following the addition of the antigen of 
interest to the coated well. One of skill in the art would be knowledgeable as to the 
parameters that can be modified to increase the signal detected as well as other variations of 
ELISAs known in the art. For further discussion regarding ELISAs see, e.g., Ausubel et al., 
eds., 1994, Current Protocols in Molecular Biology, Vol. 1, John Wiley & Sons, Inc., New 
York at 11.2.1. 

[0216] Another antibody based separation that can be used to detect the protein of 
interest is the use of flow cytometry such as by a fluorescence activated cell sorter ("FACS"). 
Typically, separation by flow cytometry is performed as follows. The suspended mixture of 
cells is centrifuged and resuspended in media. Antibodies which are conjugated to 
fluorochrome are added to allow the binding of the antibodies to specific proteins. In 
another embodiment, the secondary antibodies that are conjugated to fluorochromes can be 
used to detect primary antibodies specific to the protein of interest. The cell mixture is then 
washed by one or more centrifugation and resuspension steps. The mixture is run through a 
FACS which separates the cells based on different fluorescence characteristics. FACS 
systems are available in varying levels of performance and ability, including multi-color 
analysis. The facilitating cell can be identified by a characteristic profile of forward and 
side scatter which is influenced by size and granularity, as well as by positive and/or 
negative expression of certain cell surface markers. 

[0217] In addition to measuring the effect of a compound identified in the reporter 
gene-based assays described herein on the expression of the target gene from which the 
untranslated regions of the reporter gene construct were derived, the activity of the protein 
encoded by the target gene can be assessed utilizing techniques well-known to one of skill 
in the art. For example, the activity of a protein encoded by a target gene can be determined 
by detecting induction of a cellular second messenger (e.g., intracellular Ca²⁺, 
diacylglycerol, IP3, etc.), detecting the phosphorylation of a protein, detecting the activation 
of a transcription factor, or detecting a cellular response, for example, cellular 
differentiation, or cell proliferation. The induction of a cellular second messenger or 
phosphorylation of a protein can be determined by, e.g., immunoassays well-known to one 
of skill in the art and described herein. The activation of a transcription factor can be 
detected by, e.g., electromobility shift assays, and a cellular response such as cellular 
proliferation can be detected by, e.g., trypan blue cell counts, ³H-thymidine incorporation, 
and flow cytometry. 

5.6. Methods for Characterizing the Compounds That Modulate 
Untranslated Region-Dependent Expression of a Target Gene 

[0218] If the library comprises arrays or microarrays of compounds, wherein each 
compound has an address or identifier, the compound can be deconvoluted, e.g., by cross- 
referencing the positive sample to the original compound list that was applied to the individual 
test assays. 
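
As a minimal sketch of such deconvolution, assuming that each compound's address is recorded as a plate identifier and a well position, the positive samples can be cross-referenced against the original compound list with a simple lookup. The plate and well labels, compound identifiers, and function name below are hypothetical.

```python
def deconvolute(positive_addresses, compound_list):
    """Map each positive (plate, well) address to the compound applied there.

    Raises KeyError if a positive address is absent from the compound list,
    since that would indicate a record-keeping error rather than a hit.
    """
    return {address: compound_list[address] for address in positive_addresses}


# Hypothetical compound list keyed by (plate, well) address.
compound_list = {
    ("plate_01", "A1"): "compound_0001",
    ("plate_01", "A2"): "compound_0002",
    ("plate_01", "B7"): "compound_0019",
}

# Hypothetical addresses that scored positive in the assay.
positives = [("plate_01", "A2"), ("plate_01", "B7")]
print(deconvolute(positives, compound_list))
```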

[0219] If the library is a peptide or nucleic acid library, the sequence of the compound 
can be determined by direct sequencing of the peptide or nucleic acid. Such methods are 
well known to one of skill in the art. 

[0220] A number of physico-chemical techniques can be used for the de novo 
characterization of compounds bound to the target RNA. Examples of such techniques 
include, but are not limited to, mass spectrometry, NMR spectroscopy, X-ray 
crystallography and vibrational spectroscopy. 



5.6.1. Mass Spectrometry 

[0221] Mass spectrometry (e.g., electrospray ionization ("ESI"), matrix-assisted laser 
desorption-ionization ("MALDI"), and Fourier-transform ion cyclotron resonance ("FT- 
ICR")) can be used for elucidating the structure of a compound. 
[0222] MALDI uses a pulsed laser for desorption of the ions and a time-of-flight 
analyzer, and has been used for the detection of noncovalent tRNA:amino-acyl-tRNA 
synthetase complexes (Gruic-Sovulj et al., 1997, J. Biol. Chem. 272:32084-32091). 
However, covalent cross-linking between the target nucleic acid and the compound is 
required for detection, since a non-covalently bound complex may dissociate during the 
MALDI process. 

[0223] ESI mass spectrometry ("ESI-MS") has been of greater utility for studying non- 
covalent molecular interactions because, unlike the MALDI process, ESI-MS generates 
molecular ions with little to no fragmentation (Xavier et al., 2000, Trends Biotechnol. 
18(8):349-356). ESI-MS has been used to study the complexes formed by HIV Tat peptide 
and protein with the TAR RNA (Sannes-Lowery et al., 1997, Anal. Chem. 69:5130-5135). 
[0224] Fourier-transform ion cyclotron resonance ("FT-ICR") mass spectrometry 
provides high-resolution spectra, isotope-resolved precursor ion selection, and accurate 
mass assignments (Xavier et al., 2000, Trends Biotechnol. 18(8):349-356). FT-ICR has 
been used to study the interaction of aminoglycoside antibiotics with cognate and non- 
cognate RNAs (Hofstadler et al., 1999, Anal. Chem. 71:3436-3440; and Griffey et al., 1999, 
Proc. Natl. Acad. Sci. USA 96:10129-10133). As is true for all of the mass spectrometry 
methods discussed herein, FT-ICR does not require labeling of the target RNA or a 
compound. 

[0225] An advantage of mass spectrometry is that it permits not only the elucidation of the structure of 
the compound, but also the determination of the structure of the compound bound to a target 
RNA. Such information can enable the discovery of a consensus structure of a compound 
that specifically binds to a target RNA. 
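
To illustrate how a non-covalent complex might be recognized in a mass spectrum, the following sketch computes the expected mass-to-charge ratio of a target RNA with and without a bound compound. It assumes negative-ion electrospray, in which an ion of charge z has lost z protons, and it ignores counter-ion adducts. The function names and the example masses (8000 Da for the RNA, 500 Da for the compound) are hypothetical.

```python
PROTON_MASS_DA = 1.007276  # mass of a proton in daltons


def complex_mass(rna_mass_da: float, compound_mass_da: float, n_bound: int = 1) -> float:
    """Neutral mass of an RNA carrying n_bound non-covalently bound compound molecules."""
    return rna_mass_da + n_bound * compound_mass_da


def mz_negative_mode(neutral_mass_da: float, charge: int) -> float:
    """m/z of an ion formed by removing `charge` protons from the neutral species."""
    if charge < 1:
        raise ValueError("charge must be a positive integer")
    return (neutral_mass_da - charge * PROTON_MASS_DA) / charge


# Hypothetical masses, for illustration only.
free_rna = mz_negative_mode(8000.0, charge=5)
bound_rna = mz_negative_mode(complex_mass(8000.0, 500.0), charge=5)
print(f"free RNA m/z = {free_rna:.3f}, complex m/z = {bound_rna:.3f}")
```

At a given charge state, the complex appears shifted from the free RNA by the compound mass divided by the charge, which is the signature used to assign binding stoichiometry.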

5.6.2. NMR Spectroscopy 

[0226] NMR spectroscopy is a valuable technique for identifying complexed target 
nucleic acids by qualitatively determining changes in chemical shift, specifically from 
distances measured using relaxation effects, and NMR-based approaches have been used in 
the identification of small molecule binders of protein drug targets (Xavier et al., 2000, 
Trends Biotechnol. 18(8):349-356). The determination of structure-activity relationships 
("SAR") by NMR is the first method for NMR described in which small molecules that 
bind adjacent subsites are identified by two-dimensional ¹H-¹⁵N spectra of the target protein 
(Shuker et al., 1996, Science 274:1531-1534). The signal from the bound molecule is 
monitored by employing line broadening, transferred NOEs and pulsed field gradient 
diffusion measurements (Moore, 1999, Curr. Opin. Biotechnol. 10:54-58). A strategy for 
lead generation by NMR using a library of small molecules has been recently described 
(Fejzo et al., 1999, Chem. Biol. 6:755-769). 

[0227] In one embodiment of the present invention, the target nucleic acid complexed 
to a compound can be determined by SAR by NMR. Furthermore, SAR by NMR can also 
be used to elucidate the structure of a compound. 

[0228] As described above, NMR spectroscopy is a technique for identifying binding 
sites in target nucleic acids by qualitatively determining changes in chemical shift, 
specifically from distances measured using relaxation effects. Examples of NMR that can 
be used for the invention include, but are not limited to, one-dimensional NMR, 
two-dimensional NMR, correlation spectroscopy ("COSY"), and nuclear Overhauser effect 
("NOE") spectroscopy. Such methods of structure determination of compounds are 
well known to one of skill in the art. 

[0229] Similar to mass spectroscopy, an advantage of NMR is not only the 
elucidation of the structure of the compound, but also the determination of the structure of 
the compound bound to the target RNA. Such information can enable the discovery of a 
consensus structure of a compound that specifically binds to a target RNA. 

5.6.3. X-ray Crystallography 

[0230] X-ray crystallography can be used to elucidate the structure of a compound. For 
a review of x-ray crystallography see, e.g., Blundell et al., 2002, Nat Rev Drug Discov 
1(1):45-54. The first step in x-ray crystallography is the formation of crystals. The 
formation of crystals begins with the preparation of highly purified and soluble samples. 
The conditions for crystallization are then determined by optimizing several solution 
variables known to induce nucleation, such as pH, ionic strength, temperature, and specific 
concentrations of organic additives, salts and detergent. Techniques for automating the 
crystallization process have been developed for the production of high-quality protein 
crystals. Once crystals have been formed, the crystals are harvested and prepared for data 
collection. The crystals are then analyzed by diffraction (such as multi-circle 
diffractometers, high-speed CCD detectors, and detector off-set). Generally, multiple 
crystals must be screened for structure determinations. 

5.6.4. Vibrational Spectroscopy 
[0231] Vibrational spectroscopy (e.g., infrared (IR) spectroscopy or Raman 
spectroscopy) can be used for elucidating the structure of a compound. 
[0232] Infrared spectroscopy measures the frequencies of infrared light (wavelengths 
from 100 to 10,000 nm) absorbed by the compound as a result of excitation of vibrational 
modes according to quantum mechanical selection rules which require that absorption of 
light cause a change in the electric dipole moment of the molecule. The infrared spectrum 
of any molecule is a unique pattern of absorption wavelengths of varying intensity that can 
be considered as a molecular fingerprint to identify any compound. 
[0233] Infrared spectra can be measured in a scanning mode by measuring the 
absorption of individual frequencies of light, produced by a grating which separates 
frequencies from a mixed-frequency infrared light source, by the compound relative to a 
standard intensity (double-beam instrument) or pre-measured ('blank') intensity 
(single-beam instrument). In a preferred embodiment, infrared spectra are measured in a 
pulsed mode ("FT-IR") where a mixed beam, produced by an interferometer, of all infrared 
light frequencies is passed through or reflected off the compound. The resulting 
interferogram, which may or may not be added to the interferograms resulting from 
subsequent pulses to increase the signal strength while averaging random noise in the 
electronic signal, is mathematically transformed into a spectrum using Fourier Transform or 
Fast Fourier Transform algorithms. 
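
The co-adding and transformation steps of the FT-IR embodiment can be illustrated with a minimal sketch. The sketch below is not part of the disclosed method; it assumes synthetic interferograms built from two cosine components (hypothetical spectral bins 12 and 30) plus Gaussian noise, averages sixteen simulated scans, and applies a naive discrete Fourier transform written with the Python standard library. All function names and parameter values are illustrative assumptions.

```python
import cmath
import math
import random


def coadd(interferograms):
    """Average several interferograms point by point to suppress random noise."""
    n_scans = len(interferograms)
    n_points = len(interferograms[0])
    return [sum(scan[i] for scan in interferograms) / n_scans for i in range(n_points)]


def dft_magnitude(signal):
    """Naive discrete Fourier transform; returns magnitudes for bins 0..N/2."""
    n = len(signal)
    return [
        abs(sum(x * cmath.exp(-2j * math.pi * k * i / n) for i, x in enumerate(signal)))
        for k in range(n // 2 + 1)
    ]


def simulate_scan(n_points, bins, noise, rng):
    """Hypothetical interferogram: one cosine per spectral bin plus Gaussian noise."""
    return [
        sum(math.cos(2 * math.pi * b * i / n_points) for b in bins) + rng.gauss(0.0, noise)
        for i in range(n_points)
    ]


rng = random.Random(0)          # fixed seed so the sketch is reproducible
true_bins = [12, 30]            # assumed absorption positions (arbitrary units)
scans = [simulate_scan(128, true_bins, noise=2.0, rng=rng) for _ in range(16)]

spectrum = dft_magnitude(coadd(scans))
# Indices of the two strongest spectral bins, in ascending order.
strongest = sorted(sorted(range(len(spectrum)), key=spectrum.__getitem__)[-2:])
print(strongest)
```

Averaging N scans reduces the random noise by roughly the square root of N while leaving the signal unchanged, which is why the two simulated components remain the strongest bins after transformation. A production instrument would use a fast Fourier transform and apodization rather than the naive transform shown here.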

[0234] Raman spectroscopy measures the difference in frequency due to absorption of 
infrared frequencies of scattered visible or ultraviolet light relative to the incident beam. 
The incident monochromatic light beam, usually a single laser frequency, is not truly 
absorbed by the compound but interacts with the electric field transiently. Most of the light 
scattered off the sample will be unchanged (Rayleigh scattering) but a portion of the scattered 
light will have frequencies that are the sum or difference of the incident and molecular 
vibrational frequencies. The selection rules for Raman (inelastic) scattering require a 
change in polarizability of the molecule. While some vibrational transitions are observable 
in both infrared and Raman spectrometry, most are observable only with one or the other 
technique. The Raman spectrum of any molecule is a unique pattern of absorption 
wavelengths of varying intensity that can be considered as a molecular fingerprint to 
identify any compound. 
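
The sum-or-difference relationship described above is conventionally reported as a Raman shift in wavenumbers. The following sketch is an illustration only, assuming wavelengths are given in nanometers and using the standard conversion of 10^7 divided by the wavelength in nm to obtain cm^-1; the function names and the 532 nm laser line are hypothetical choices, not values taken from this specification.

```python
def raman_shift_cm1(incident_nm, scattered_nm):
    """Raman shift in cm^-1: positive for Stokes lines, negative for anti-Stokes."""
    return 1e7 / incident_nm - 1e7 / scattered_nm


def scattered_wavelength_nm(incident_nm, shift_cm1):
    """Wavelength (nm) of the scattered light for a given Raman shift in cm^-1."""
    return 1e7 / (1e7 / incident_nm - shift_cm1)


laser_nm = 532.0                      # assumed excitation wavelength
stokes_nm = scattered_wavelength_nm(laser_nm, 1000.0)
anti_stokes_nm = scattered_wavelength_nm(laser_nm, -1000.0)
print(round(stokes_nm, 2), round(anti_stokes_nm, 2))
```

A Stokes line (difference frequency) appears at a longer wavelength than the incident beam and an anti-Stokes line (sum frequency) at a shorter wavelength, while Rayleigh-scattered light has a shift of zero.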

[0235] Raman spectra are measured by submitting monochromatic light to the sample, 
either passed through or preferably reflected off, filtering the Rayleigh scattered light, and 
detecting the frequency of the Raman scattered light. An improved Raman spectrometer is 
described in U.S. Patent No. 5,786,893 to Fink et al., which is hereby incorporated by 
reference. 

[0236] Vibrational microscopy can be measured in a spatially resolved fashion to 
address single beads by integration of a visible microscope and spectrometer. A 
microscopic infrared spectrometer is described in U.S. Patent No. 5,581,085 to Reffner et 
al., which is hereby incorporated by reference in its entirety. An instrument that 
simultaneously performs a microscopic infrared and microscopic Raman analysis on a 
sample is described in U.S. Patent No. 5,841,139 to Sostek et al., which is hereby 
incorporated by reference in its entirety. 

[0237] In one embodiment of the method, compounds are synthesized on polystyrene 
beads doped with chemically modified styrene monomers such that each resulting bead has 
a characteristic pattern of absorption lines in the vibrational (IR or Raman) spectrum, by 
methods including but not limited to those described by Fenniri et al., 2000, J. Am. Chem. 
Soc. 123:8151-8152. Using methods of split-pool synthesis familiar to one of skill in the 
art, the library of compounds is prepared so that the spectroscopic pattern of the bead 
identifies one of the components of the compound on the bead. Beads that have been 
separated according to their ability to bind target RNA can be identified by their vibrational 
spectrum. In one embodiment of the method, appropriate sorting and binning of the beads 
during synthesis then allows identification of one or more further components of the 
compound on any one bead. In another embodiment of the method, partial identification of 
the compound on a bead is possible through use of the spectroscopic pattern of the bead 
with or without the aid of further sorting during synthesis, followed by partial resynthesis of 
the possible compounds aided by doped beads and appropriate sorting during synthesis. 
[0238] In another embodiment, the IR or Raman spectra of compounds are examined 
while the compound is still on a bead, preferably, or after cleavage from the bead, using 
methods including but not limited to photochemical, acid, or heat treatment. The compound 
can be identified by comparison of the IR or Raman spectral pattern to spectra previously 
acquired for each compound in the combinatorial library. 
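
The comparison of a measured spectral pattern against previously acquired library spectra can be sketched as follows. This is a minimal illustration under stated assumptions: spectra are represented as intensity vectors sampled on a common grid, the library entries and the measured vector are invented values, and cosine similarity is one of several possible scoring choices rather than a method prescribed by this specification.

```python
import math


def cosine_similarity(a, b):
    """Cosine of the angle between two intensity vectors on the same grid."""
    dot = sum(x * y for x, y in zip(a, b))
    norm = math.sqrt(sum(x * x for x in a)) * math.sqrt(sum(y * y for y in b))
    return dot / norm if norm else 0.0


def best_match(measured, library):
    """Name of the library spectrum most similar to the measured spectrum."""
    return max(library, key=lambda name: cosine_similarity(measured, library[name]))


# Hypothetical reference spectra for three library members (arbitrary units).
library = {
    "compound_A": [0.0, 1.0, 0.0, 0.2, 0.0, 0.9],
    "compound_B": [0.8, 0.0, 0.1, 0.0, 1.0, 0.0],
    "compound_C": [0.0, 0.3, 1.0, 0.0, 0.0, 0.4],
}

# Hypothetical spectrum measured from a bead that bound the target RNA.
measured = [0.05, 0.9, 0.1, 0.25, 0.0, 0.8]
print(best_match(measured, library))
```

In practice a similarity threshold would typically be applied as well, so that a bead whose spectrum resembles no library entry is flagged rather than assigned to the nearest one.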

5.7. Secondary Screens of Compounds 
[0239] Once a compound has been identified to modulate untranslated region- 
dependent expression of a target gene and preferably, the structure of the compound has 
been identified by the methods described in Section 5.6, the compounds are tested for 
biological activity in further assays and/or animal models (see, e.g., Sections 5.7.1 and 
5.7.2). Further, a lead compound may be used to design congeners or analogs (see, e.g., 
Section 5.7.3). 

5.7.1. Cell-based Screens 
[0240] The compounds identified in the assays described supra (for convenience 
referred to herein as a "lead" compound) can be tested for biological activity using host 
cells containing or engineered to contain the target RNA element involved in untranslated 
region-dependent gene expression coupled to a functional readout system. For example, a 
phenotypic or physiological readout can be used to assess untranslated region-dependent 
activity of the target RNA in the presence and absence of the lead compound. 
[0241] In one embodiment, a phenotypic or physiological readout can be used to assess 
untranslated region-dependent activity of the target RNA in the presence and absence of the 
lead compound. For example, the target RNA may be overexpressed in a cell in which the 
target RNA is endogenously expressed. Where the target RNA controls untranslated 
region-dependent expression of a gene product involved in cell growth or viability, the in 
vivo effect of the lead compound can be assayed by measuring the cell growth or viability 
of the target cell. Such assays can be carried out with representative cells of cell types 
involved in a particular disease or disorder (e.g., leukocytes such as T cells, B cells, natural 
killer cells, macrophages, neutrophils and eosinophils). A lower level of proliferation or 
survival of the contacted cells indicates that the lead compound is effective to treat a 
condition in the patient characterized by uncontrolled cell growth. Alternatively, instead of 
culturing cells from a patient, a lead compound may be screened using cells of a tumor or 
malignant cell line or an endothelial cell line. Specific examples of cell culture models 
include, but are not limited to, for lung cancer, primary rat lung tumor cells (see, e.g., 
Swafford et al., 1997, Mol. Cell. Biol. 17:1365-1374) and large-cell undifferentiated cancer 
cell lines (see, e.g., Mabry et al., 1991, Cancer Cells, 3:53-58); colorectal cell lines for 
colon cancer (see, e.g., Park & Gazdar, 1996, J. Cell Biochem. Suppl. 24:131-141); multiple 
established cell lines for breast cancer (see, e.g., Hambly et al., 1997, Breast Cancer Res. 
Treat. 43:247-258; Gierthy et al., 1997, Chemosphere 34:1495-1505; and Prasad & Church, 
1997, Biochem. Biophys. Res. Commun. 232:14-19); a number of well-characterized cell 
models for prostate cancer (see, e.g., Webber et al., 1996, Prostate, Part 1, 29:386-394; Part 
2, 30:58-64; and Part 3, 30:136-142 and Boulikas, 1997, Anticancer Res. 17:1471-1505); 
for genitourinary cancers, continuous human bladder cancer cell lines (see, e.g., Ribeiro et 
al., 1997, Int. J. Radiat. Biol. 72:11-20); organ cultures of transitional cell carcinomas (see, 
e.g., Booth et al., 1997, Lab Invest. 76:843-857) and rat progression models (see, e.g., Vet 
et al., 1997, Biochim. Biophys Acta 1360:39-44); and established cell lines for leukemias 
and lymphomas (see, e.g., Drexler, 1994, Leuk. Res. 18:919-927 and Tohyama, 1997, Int. J. 
Hematol. 65:309-317). 

[0242] Many assays well known in the art can be used to assess the survival and/or 
growth of a patient cell or cell line following exposure to a lead compound; for example, 
cell proliferation can be assayed by measuring bromodeoxyuridine (BrdU) incorporation 
(see, e.g., Hoshino et al., 1986, Int. J. Cancer 38:369 and Campana et al., 1988, J. Immunol. 
Meth. 107:79) or (³H)-thymidine incorporation (see, e.g., Chen, 1996, Oncogene 
13:1395-403 and Jeoung, 1995, J. Biol. Chem. 270:18367-73), by direct cell count, or by detecting 
changes in transcription, translation or activity of known genes such as proto-oncogenes 
(e.g., fos, myc) or cell cycle markers (Rb, cdc2, cyclin A, D1, D2, D3, E, etc.). The levels 
of such protein and mRNA and activity can be determined by any method well known in 
the art. For example, protein can be quantitated by known immunodiagnostic methods such 
as western blotting or immunoprecipitation using commercially available antibodies. 
mRNA can be quantitated using methods that are well known and routine in the art, for 
example, using northern analysis, RNase protection, or the polymerase chain reaction in 
connection with reverse transcription ("RT-PCR"). Cell viability can be assessed by using 
trypan-blue staining or other cell death or viability markers known in the art. In a specific 
embodiment, the level of cellular ATP is measured to determine cell viability. 
Differentiation can be assessed, for example, visually based on changes in morphology. 
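
As a worked illustration of how a viability readout such as cellular ATP might be compared between treated and untreated cells, the sketch below expresses the mean treated signal as a percentage of the mean control signal after subtracting a background value. The function name, the optional blank subtraction, and the numeric readings are assumptions made for illustration; the specification does not prescribe a particular calculation.

```python
from statistics import mean


def percent_of_control(treated, control, blank=0.0):
    """Mean treated signal as a percentage of the mean untreated-control signal.

    `blank` is an assumed background reading (e.g., medium without cells)
    subtracted from both groups before the ratio is taken.
    """
    control_signal = mean(control) - blank
    if control_signal <= 0:
        raise ValueError("control signal must exceed the blank")
    return 100.0 * (mean(treated) - blank) / control_signal


# Hypothetical luminescence readings from replicate wells (arbitrary units).
untreated_wells = [980.0, 1010.0, 1010.0]
treated_wells = [400.0, 420.0, 410.0]
viability = percent_of_control(treated_wells, untreated_wells, blank=10.0)
print(round(viability, 1))
```

A value well below 100 would correspond to the lower level of proliferation or survival discussed above; replicate wells and an appropriate statistical test would be needed before drawing conclusions from real data.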
[0243] The lead compound can also be assessed for its ability to inhibit cell 
transformation (or progression to malignant phenotype) in vitro. In this embodiment, cells 
with a transformed cell phenotype are contacted with a lead compound, and examined for 
change in characteristics associated with a transformed phenotype (a set of in vitro 
characteristics associated with a tumorigenic ability in vivo), for example, but not limited to, 
colony formation in soft agar, a more rounded cell morphology, looser substratum 
attachment, loss of contact inhibition, loss of anchorage dependence, release of proteases 
such as plasminogen activator, increased sugar transport, decreased serum requirement, or 
expression of fetal antigens, etc. (see, e.g., Luria et al., 1978, General Virology, 3d Ed., 
John Wiley & Sons, New York, pp. 436-446). 

[0244] Loss of invasiveness or decreased adhesion can also be assessed to demonstrate 
the anti-cancer effects of a lead compound. For example, an aspect of the formation of a 
metastatic cancer is the ability of a precancerous or cancerous cell to detach from the primary 
site of disease and establish a novel colony of growth at a secondary site. The ability of a 
cell to invade peripheral sites reflects its potential for a cancerous state. Loss of 
invasiveness can be measured by a variety of techniques known in the art including, for 
example, induction of E-cadherin-mediated cell-cell adhesion. Such E-cadherin-mediated 
adhesion can result in phenotypic reversion and loss of invasiveness (see, e.g., Hordijk et 
al., 1997, Science 278:1464-66). 

[0245] Loss of invasiveness can further be examined by inhibition of cell migration. A 
variety of 2-dimensional and 3-dimensional cellular matrices are commercially available 
(Calbiochem-Novabiochem Corp. San Diego, CA). Cell migration across or into a matrix 
can be examined using microscopy, time-lapsed photography or videography, or by any 
method in the art allowing measurement of cellular migration. In a related embodiment, 
loss of invasiveness is examined by response to hepatocyte growth factor ("HGF"). 
HGF-induced cell scattering is correlated with invasiveness of cells such as Madin-Darby canine 
kidney ("MDCK") cells. This assay identifies a cell population that has lost cell scattering 
activity in response to HGF (see, e.g., Hordijk et al., 1997, Science 278:1464-66). 
[0246] Alternatively, loss of invasiveness can be measured by cell migration through a 
chemotaxis chamber (Neuroprobe/Precision Biochemicals Inc. Vancouver, BC). In such an 
assay, a chemo-attractant agent is incubated on one side of the chamber (e.g., the bottom 
chamber) and cells are plated on a filter separating the opposite side (e.g., the top chamber). 
In order for cells to pass from the top chamber to the bottom chamber, the cells must 
actively migrate through small pores in the filter. Checkerboard analysis of the number of 
cells that have migrated can then be correlated with invasiveness (see, e.g., Ohnishi, 1993, 
Biochem. Biophys. Res. Commun. 193:518-25). 

[0247] The effect of a compound of the invention on cell adhesion can be measured 
using HUVECs. HUVECs are seeded on 24-well culture plates and incubated for 2 days to 
allow formation of a confluent monolayer. Cancerous cells or a cancer cell line such as 
LS-180 human colon adenocarcinoma cells are labeled with 5 µM Calcein-AM for 30 min. 
Calcein-AM labeled LS-180 cells are added into each well of the HUVEC culture and 
incubated for 10 min at 37°C. TNF-α (80 ng/ml) is then added and the culture incubated for 
an additional 110 min. Non-adherent cells are removed by washing with PBS. The 
fluorescence intensity of adherent LS-180 cells in each individual well is measured by a 
fluorescent plate reader set at excitation 485/20 nm and emission at 530/25 nm. 
[0248] The effect of a compound of the invention on cell migration and invasion can 
also be determined using an assay based on the BD BioCoat Angiogenesis System (BD 
Biosciences, Bedford, MA). The fluorescence blocking membrane of the insert is a 3 
micron pore size PET filter which has been coated either with BD Matrigel basement matrix 
(for invasion assay) or without Matrigel matrix (for migration assay). HUVECs (250 
µl/well) in culture medium without serum are added to the top chamber; a compound is added 
to bottom wells containing medium (750 µl/well) with VEGF as a chemo-attractant. Cells 
are then incubated for 22 h at 37°C. After incubation, cells are stained with Calcein AM for 
measurement of fluorescence. 

[0249] Further, a lead compound can be assessed for its ability to alter viral replication 
(as determined, e.g., by plaque formation) or the production of viral proteins (as 
determined, e.g., by western blot analysis) or viral RNAs (as determined, e.g., by RT-PCR 
or northern blot analysis) in cultured cells in vitro using methods which are well known in 
the art. A lead compound can also be assessed for its ability to alter bacterial replication (as 
determined, e.g., by measuring bacterial growth rates) or the production of bacterial 
proteins (as determined, e.g., by western blot analysis) or bacterial RNAs (as determined, 
e.g., by RT-PCR or northern blot analysis) in cultured cells in vitro using methods which 
are well known in the art. Finally, a lead compound can be assessed for its ability to alter 
fungal replication (as determined, e.g., by fungal growth rates, such as macrodilution 
methods and/or microdilution methods using protocols known to those skilled in the art 
(see, e.g., Clancy et al., 1997, Journal of Clinical Microbiology, 35(11): 2878-82; Ryder et 
al., 1998, Antimicrobial Agents and Chemotherapy, 42(5): 1057-61; or U.S. Patent Nos. 
5,521,153; 5,883,120; and 5,521,169)) or the production of fungal proteins (as 
determined, e.g., by western blot analysis) or fungal RNAs (as determined, e.g., by RT-PCR 
or northern blot analysis) in cultured cells in vitro using methods which are well known in 
the art. 

5.7.2. Animal Model-based Screens 
[0250] The lead compounds identified in the reporter gene-based assay described herein 
can be tested for biological activity using animal models for a disease, disorder, condition, 
or syndrome of interest. These include animals engineered to contain a target gene coupled 
to a functional readout system, such as a transgenic mouse. Such animal model systems 
include, but are not limited to, rats, mice, chicken, cows, monkeys, pigs, dogs, rabbits, etc. 
In a specific embodiment of the invention, a compound identified in accordance with the 
methods of the invention is tested in a mouse model system. Such model systems are 
widely used and well-known to the skilled artisan such as the SCID mouse model or 
transgenic mice. 

[0251] The anti-angiogenic activity of a compound identified in accordance with the 
invention can be determined by using various experimental animal models of vascularized 
tumors. The anti-tumor activity of a compound identified in accordance with the invention 
can be determined by administering the compound to an animal model and verifying that 
the compound is effective in reducing the proliferation or spread of cancer cells in said 
animal model. An example of an animal model for human cancer in general includes, but is 
not limited to, spontaneously occurring tumors of companion animals (see, e.g., Vail & 
MacEwen, 2000, Cancer Invest 18(8):781-92). 

[0252] Examples of animal models for lung cancer include, but are not limited to, lung 
cancer animal models described by Zhang & Roth (1994, In Vivo 8(5):755-69) and a 
transgenic mouse model with disrupted p53 function (see, e.g., Morris et al., 1998, J La 
State Med Soc 150(4):179-85). An example of an animal model for breast cancer includes, 
but is not limited to, a transgenic mouse that overexpresses cyclin D1 (see, e.g., Hosokawa 
et al., 2001, Transgenic Res 10(5):471-8). An example of an animal model for colon cancer 
includes, but is not limited to, a TCRβ and p53 double knockout mouse (see, e.g., Kado et 
al., 2001, Cancer Res 61(6):2395-8). Examples of animal models for pancreatic cancer 
include, but are not limited to, a metastatic model of Panc02 murine pancreatic 
adenocarcinoma (see, e.g., Wang et al., 2001, Int J Pancreatol 29(1):37-46) and nu-nu mice 
generated in subcutaneous pancreatic tumours (see, e.g., Ghaneh et al., 2001, Gene Ther 
8(3):199-208). Examples of animal models for non-Hodgkin's lymphoma include, but are 
not limited to, a severe combined immunodeficiency ("SCID") mouse (see, e.g., Bryant et 
al., 2000, Lab Invest 80(4):553-73) and an IgHmu-HOX11 transgenic mouse (see, e.g., 
Hough et al., 1998, Proc Natl Acad Sci USA 95(23):13853-8). An example of an animal 
model for esophageal cancer includes, but is not limited to, a mouse transgenic for the 
human papillomavirus type 16 E7 oncogene (see, e.g., Herber et al., 1996, J Virol 
70(3):1873-81). Examples of animal models for colorectal carcinomas include, but are not 
limited to, Apc mouse models (see, e.g., Fodde & Smits, 2001, Trends Mol Med 
7(8):369-73 and Kuraguchi et al., 2000, Oncogene 19(50):5755-63). 

[0253] The anti-inflammatory activity of a compound identified in accordance with the 
invention can be determined by using various experimental animal models of inflammatory 
arthritis known in the art and described in Crofford & Wilder, "Arthritis and Autoimmunity 
in Animals", in Arthritis and Allied Conditions: A Textbook of Rheumatology, McCarty et 
al. (eds.), Chapter 30 (Lea & Febiger, 1993). Experimental and spontaneous animal models 
of inflammatory arthritis and autoimmune rheumatic diseases can also be used to assess the 
anti-inflammatory activity of a compound identified in accordance with the invention. The 
principal animal models for arthritis or inflammatory disease known in the art and widely 
used include: adjuvant-induced arthritis rat models, collagen-induced arthritis rat and mouse 
models and antigen-induced arthritis rat, rabbit and hamster models, all described in 
Crofford & Wilder, "Arthritis and Autoimmunity in Animals", in Arthritis and Allied 
Conditions: A Textbook of Rheumatology, McCarty et al. (eds.), Chapter 30 (Lea & 
Febiger, 1993), incorporated herein by reference in its entirety. 

[0254] The anti-inflammatory activity of a compound identified in accordance with the 
invention can be assessed using a carrageenan-induced arthritis rat model. 
Carrageenan-induced arthritis has also been used in rabbit, dog and pig in studies of chronic arthritis or 
inflammation. Quantitative histomorphometric assessment is used to determine therapeutic 
efficacy. The methods for using such a carrageenan-induced arthritis model are described in 
Hansra et al., 2000, Inflammation 24(2):141-155. Also commonly used are 
zymosan-induced inflammation animal models as known and described in the art. 
[0255] The anti-inflammatory activity of a compound identified in accordance with the 
invention can also be assessed by measuring the inhibition of carrageenan-induced paw 
edema in the rat, using a modification of the method described in Winter et al., 1962, Proc. 
Soc. Exp. Biol. Med. 111:544-547. This assay has been used as a primary in vivo screen for 
the anti-inflammatory activity of most NSAIDs, and is considered predictive of human 
efficacy. The anti-inflammatory activity of a compound identified in accordance with the 
invention is expressed as the percent inhibition of the increase in hind paw weight of the 
test group relative to the vehicle dosed control group. 
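
The percent-inhibition readout described in the preceding paragraph can be written out as a short calculation. The sketch below is illustrative only: it assumes each animal contributes one value for the increase in hind paw weight, takes group means, and computes 100 x (1 - test increase / vehicle increase). The function name and the sample values are hypothetical.

```python
from statistics import mean


def percent_inhibition(test_increase, vehicle_increase):
    """Percent inhibition of the paw-weight increase relative to vehicle control.

    Each argument is a list of per-animal increases in hind paw weight
    (same units in both groups).
    """
    vehicle_mean = mean(vehicle_increase)
    if vehicle_mean <= 0:
        raise ValueError("vehicle group must show a positive increase")
    return 100.0 * (1.0 - mean(test_increase) / vehicle_mean)


# Hypothetical increases in hind paw weight (grams) after carrageenan challenge.
vehicle_group = [0.52, 0.48, 0.50]
test_group = [0.20, 0.30, 0.25]
inhibition = percent_inhibition(test_group, vehicle_group)
print(round(inhibition, 1))
```

A value of 0 means the test group swelled as much as the vehicle-dosed group, and a value of 100 means no measurable increase in the test group.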

[0256] In a specific embodiment of the invention where the experimental animal model 
used is an adjuvant-induced arthritis rat model, body weight can be measured relative to a 
control group to determine the anti-inflammatory activity of a compound identified in 
accordance with the invention. Alternatively, the efficacy of a compound identified in 
accordance with the invention can be assessed using assays that determine bone loss. 
Animal models such as ovariectomy-induced bone resorption mice, rat and rabbit models 
are known in the art for obtaining dynamic parameters for bone formation. Using methods 
such as those described by Yoshitake et al. or Yamamoto et al., bone volume is measured in 
vivo by microcomputed tomography analysis and bone histomorphometry analysis (see, 
e.g., Yoshitake et al., 1999, Proc. Natl. Acad. Sci. 96:8156-8160 and Yamamoto et al., 
1998, Endocrinology 139(3):1411-1419, both incorporated herein by reference in their 
entireties). 

[0257] Additionally, animal models for inflammatory bowel disease can also be used to 
assess the efficacy of a compound identified in accordance with the invention (see, e.g., 
Kim et al., 1992, Scand. J. Gastroenterol. 27:529-537 and Strober, 1985, Dig. Dis. Sci. 30(12 
Suppl):3S-10S). Ulcerative colitis and Crohn's disease are human inflammatory bowel 
diseases that can be induced in animals. Sulfated polysaccharides including, but not limited 
to, amylopectin, carrageen, amylopectin sulfate, and dextran sulfate or chemical irritants 
including, but not limited to, trinitrobenzenesulphonic acid (TNBS) and acetic acid can be 
administered to animals orally to induce inflammatory bowel diseases. 

[0258] Animal models for asthma can also be used to assess the efficacy of a compound 
identified in accordance with the invention. An example of one such model is the murine 
adoptive transfer model in which aeroallergen provocation of TH1 or TH2 recipient mice 
results in TH effector cell migration to the airways and is associated with an intense 
neutrophilic (TH1) and eosinophilic (TH2) lung mucosal inflammatory response (see, e.g., 
Cohn et al., 1997, J. Exp. Med. 186:1737-1747). 

[0259] Animal models for autoimmune disorders can also be used to assess the efficacy 
of a compound identified in accordance with the invention. Animal models for autoimmune 
disorders such as type 1 diabetes, thyroid autoimmunity, systemic lupus erythematosus, and 
glomerulonephritis have been developed (see, e.g., Flanders et al., 1999, Autoimmunity 
29:235-246; Krogh et al., 1999, Biochimie 81:511-515; and Foster, 1999, Semin. Nephrol. 
19:12-24). 

[0260] Animal models for viral infections can also be used to assess the efficacy of a 
compound identified in accordance with the invention. Animal models for viral infections 
such as EBV-associated diseases, gammaherpesviruses, infectious mononucleosis, simian 
immunodeficiency virus ("SIV"), Borna disease virus infection, hepatitis, varicella virus 
infection, viral pneumonitis, Epstein-Barr virus pathogenesis, feline immunodeficiency 
virus ("FIV"), HTLV type 1 infection, human rotaviruses, and genital herpes have been 
developed (see, e.g., Hayashi et al., 2002, Histol Histopathol 17(4):1293-310; Arico et al., 
2002, J Interferon Cytokine Res 22(11):1081-8; Flano et al., 2002, Immunol Res 
25(3):201-17; Sauermann, 2001, Curr Mol Med 1(4):515-22; Pletnikov et al., 2002, Front Biosci 
7:d593-607; Engler et al., 2001, Mol Immunol 38(6):457-65; White et al., 2001, Brain 
Pathol 11(4):475-9; Davis & Matalon, 2001, News Physiol Sci 16:185-90; Wang, 2001, 
Curr Top Microbiol Immunol. 258:201-19; Phillips et al., 2000, J Psychopharmacol. 
14(3):244-50; Kazanji, 2000, AIDS Res Hum Retroviruses. 16(16):1741-6; Saif et al., 1996, 
Arch Virol Suppl. 12:153-61; and Hsiung et al., 1984, Rev Infect Dis. 6(1):33-50). 
[0261] Animal models for bacterial infections can also be used to assess the efficacy of 
a compound identified in accordance with the invention. Animal models for bacterial 
infections such as H. pylori infection, genital mycoplasmosis, primary sclerosing 
cholangitis, cholera, chronic lung infection with Pseudomonas aeruginosa, Legionnaires' 
disease, gastroduodenal ulcer disease, bacterial meningitis, gastric Helicobacter infection, 
pneumococcal otitis media, experimental allergic neuritis, leprous neuropathy, 
mycobacterial infection, endocarditis, Aeromonas-associated enteritis, Bacteroides fragilis 
infection, syphilis, streptococcal endocarditis, acute hematogenous osteomyelitis, human 
scrub typhus, toxic shock syndrome, anaerobic infections, Escherichia coli infections, and 
Mycoplasma pneumoniae infections have been developed (see, e.g., Sugiyama et al., 2002, 
J Gastroenterol. 37 Suppl 13:6-9; Brown et al., 2001, Am J Reprod Immunol. 46(3):232-41; 
Vierling, 2001, Best Pract Res Clin Gastroenterol. 15(4):591-610; Klose, 2000, Trends 
Microbiol. 8(4):189-91; Stotland et al., 2000, Pediatr Pulmonol. 30(5):413-24; Brieland et 
al., 2000, Immunopharmacology 48(3):249-52; Lee, 2000, Baillieres Best Pract Res Clin 
Gastroenterol. 14(1):75-96; Koedel & Pfister, 1999, Infect Dis Clin North Am. 
13(3):549-77; Nedrud, 1999, FEMS Immunol Med Microbiol. 24(2):243-50; Prellner et al., 1999, 
Microb Drug Resist. 5(1):73-82; Vriesendorp, 1997, J Infect Dis. 176 Suppl 2:S164-8; 
Shetty & Antia, 1996, Indian J Lepr. 68(1):95-104; Balasubramanian et al., 1994, 
Immunobiology 191(4-5):395-401; Carbon et al., 1994, Int J Biomed Comput. 
36(1-2):59-67; Haberberger et al., 1991, Experientia. 47(5):426-9; Onderdonk et al., 1990, Rev Infect 
Dis. 12 Suppl 2:S169-77; Wicher & Wicher, 1989, Crit Rev Microbiol. 16(3):181-234; 
Scheld, 1987, J Antimicrob Chemother. 20 Suppl A:71-85; Emslie & Nade, 1986, Rev 
Infect Dis. 8(6):841-9; Ridgway et al., 1986, Lab Anim Sci. 36(5):481-5; Quimby & 
Nguyen, 1985, Crit Rev Microbiol. 12(1):1-44; Onderdonk et al., 1979, Rev Infect Dis. 
1(2):291-301; Smith, 1976, Ciba Found Symp. (42):45-72, and Taylor-Robinson, 1976, 
Infection. 4(1 Suppl):4-8). 

[0262] Aiiimal models for fungal infections can also be used to assess the efficacy of a 
compound identified in accordance with the invention. Animal models for fiingal infections 
sucli as Candida infections, zygomycosis, Candida mastitis, progressive disseminated 
trichosporonosis with latent trichosporonemia, disseminated candidiasis, pulmonary 
paracoccidioidomycosis, pulmonary aspergillosis, Pneumocystis carinii pneumonia, 
cryptococcal meningitis, coccidioidal meningoencephalitis and cerebrospinal vasculitis, 
Aspergillus niger infection, Fusarium keratitis, paranasal sinus mycoses, Aspergillus 
fiimigatus endocarditis, tibial dyschondroplasia, Candida glabrata vaginitis, oropharyngeal 
candidiasis, X-linked chronic granulomatous disease, tinea pedis, cutaneous candidiasis, 
mycotic placentitis, disseminated trichosporonosis, allergic bronchopulmonary 
aspergillosis, mycotic keratitis, Cryptococcus neoformans infection, fungal peritonitis, 
Curvularia geniculata infection, staphylococcal endophthalmitis, sporotrichosis, and 
dermatophytosis have been developed (see, e.g., Arendrup et al., 2002, Infection 30(5):286- 
91; Kamei, 2001, Mycopathologia 152(1):5-13; Guhad et al., 2000, FEMS Microbiol 
Lett. 192(1):27-31; Yamagata et al., 2000, J Clin Microbiol. 38(9):3260-6; Andrutis et al., 
2000, J Clin Microbiol. 38(6):2317-23; Cock et al., 2000, Rev Inst Med Trop Sao Paulo 
42(2):59-66; Shibuya et al., 1999, Microb Pathog. 27(3):123-31; Beers et al., 1999, J Lab 
Clin Med. 133(5):423-33; Najvar et al., 1999, Antimicrob Agents Chemother. 43(2):413-4; 
Williams et al., 1988, J Infect Dis. 178(4):1217-21; Yoshida, 1998, Kansenshogaku Zasshi. 
72(6):621-30; Alexandrakis et al., 1998, Br J Ophthalmol. 82(3):306-11; 
Chakrabarti et al., 1997, J Med Vet Mycol. 35(4):295-7; Martin et al., 1997, Antimicrob 
Agents Chemother. 41(1):13-6; Chu et al., 1996, Avian Dis. 40(3):715-9; Fidel et al., 1996, 
J Infect Dis. 173(2):425-31; Cole et al., 1995, FEMS Microbiol Lett. 126(2):177-80; 
Pollock et al., 1995, Nat Genet. 9(2):202-9; Uchida et al., 1994, Jpn J Antibiot. 
47(10):1407-12; Maebashi et al., 1994, J Med Vet Mycol. 32(5):349-59; Jensen & 
Schonheyder, 1993, J Exp Anim Sci. 35(4):155-60; Gokaslan & Anaissie, 1992, Infect 
Immun. 60(8):3339-44; Kurup et al., 1992, J Immunol. 148(12):3783-8; Singh et al., 1990, 
Mycopathologia. 112(3):127-37; Salkowski & Balish, 1990, Infect Immun. 58(10):3300-6; 
Ahmad et al., 1986, Am J Kidney Dis. 7(2):153-6; Alture-Werber & Edberg, 1985, 
Mycopathologia. 89(2):69-73; Kane et al., 1981, Antimicrob Agents Chemother. 20(5):595-
9; Barbee et al., 1977, Am J Pathol. 86(1):281-4; and Maestrone et al., 1973, Am J Vet Res. 
34(6):833-6). 

[0263] The toxicity and/or efficacy of a compound identified in accordance with the 
invention can be determined by standard pharmaceutical procedures in cell cultures or 
experimental animals, e.g., for determining the LD50 (the dose lethal to 50% of the 
population) and the ED50 (the dose therapeutically effective in 50% of the population). The 
dose ratio between toxic and therapeutic effects is the therapeutic index and it can be 
expressed as the ratio LD50/ED50. A compound identified in accordance with the invention 
that exhibits a large therapeutic index is preferred. While a compound identified in 
accordance with the invention that exhibits toxic side effects may be used, care should be 
taken to design a delivery system that targets such agents to the site of affected tissue in 
order to minimize potential damage to uninfected cells and, thereby, reduce side effects. 
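
By way of illustration only, the therapeutic index described above can be computed as in the following minimal Python sketch. The function name, the compound labels, and the dose values are hypothetical and are not taken from the specification; the sketch assumes that LD50 and ED50 are expressed in the same units.

```python
def therapeutic_index(ld50: float, ed50: float) -> float:
    """Return the ratio LD50/ED50; both doses must use the same units."""
    if ld50 <= 0 or ed50 <= 0:
        raise ValueError("LD50 and ED50 must be positive")
    return ld50 / ed50


# Hypothetical (LD50, ED50) pairs in mg/kg, for illustration only.
candidates = {
    "compound_A": (400.0, 10.0),
    "compound_B": (90.0, 30.0),
}

# Compute the index for each candidate and pick the largest one.
indices = {name: therapeutic_index(ld, ed) for name, (ld, ed) in candidates.items()}
preferred = max(indices, key=indices.get)
print(indices, preferred)
```

Under these assumed values, the candidate with the larger ratio would be the preferred one in the sense of the paragraph above.
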
[0264] The data obtained from the cell culture assays and animal studies can be used in 
formulating a range of dosage of a compound identified in accordance with the invention 
for use in humans. The dosage of such agents lies preferably within a range of circulating 
concentrations that include the ED50 with little or no toxicity. The dosage may vary within 
this range depending upon the dosage form employed and the route of administration 
utilized. For any agent used in the method of the invention, the therapeutically effective 
dose can be estimated initially from cell culture assays. A dose may be formulated in 
animal models to achieve a circulating plasma concentration range that includes the IC50 
(i.e., the concentration of the compound that achieves a half-maximal inhibition of 
symptoms) as determined in cell culture. Such information can be used to more accurately 
determine useful doses in humans. Levels in plasma may be measured, for example, by 
high performance liquid chromatography. 
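
As a non-limiting illustration, an IC50 can be estimated from cell culture dose-response data as sketched below in Python. The sketch assumes that inhibition is expressed as a fraction of the maximal inhibition and uses linear interpolation on the logarithm of concentration between the two measurements that bracket 50% inhibition. This interpolation is one simple choice made for the example, not a procedure prescribed by the specification, and the function name is hypothetical.

```python
import math


def estimate_ic50(concentrations, inhibitions):
    """Estimate the concentration giving half-maximal inhibition.

    concentrations: positive values in ascending order (any consistent unit).
    inhibitions: fraction of maximal inhibition (0.0 to 1.0) at each concentration.
    Uses linear interpolation on log10(concentration) between bracketing points.
    """
    points = list(zip(concentrations, inhibitions))
    for (c_lo, i_lo), (c_hi, i_hi) in zip(points, points[1:]):
        if i_lo <= 0.5 <= i_hi and i_hi > i_lo:
            frac = (0.5 - i_lo) / (i_hi - i_lo)
            log_ic50 = math.log10(c_lo) + frac * (math.log10(c_hi) - math.log10(c_lo))
            return 10 ** log_ic50
    raise ValueError("50% inhibition is not bracketed by the data")
```

For example, with hypothetical measurements `estimate_ic50([1, 100], [0.0, 1.0])`, the estimate falls at the geometric midpoint of the two concentrations.
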



5.7.3. Design of Congeners or Analogs 
[0265] The compounds which display the desired biological activity can be used as lead 
compounds for the development or design of congeners or analogs having useful 
pharmacological activity. For example, once a lead compound is identified, molecular 
modeling techniques can be used to design variants of the compound that can be more 
effective. Examples of molecular modeling systems are the CHARMm and QUANTA 
programs (Polygen Corporation, Waltham, MA). CHARMm performs the energy 
minimization and molecular dynamics functions. QUANTA performs the construction, 
graphic modeling and analysis of molecular structure. QUANTA allows interactive 
construction, modification, visualization, and analysis of the behavior of molecules with 
each other. 

[0266] A number of articles review computer modeling of drugs interactive with 
specific proteins, such as Rotivinen et al., 1988, Acta Pharmaceutica Fennica 97:159-166; 
Ripka, 1988, New Scientist 54-57; McKinlay & Rossmann, 1989, Annu. Rev. Pharmacol. 
Toxicol. 29:111-122; Perry & Davies, QSAR: Quantitative Structure-Activity 
Relationships in Drug Design pp. 189-193 (Alan R. Liss, Inc. 1989); Lewis & Dean, 1989, 
Proc. R. Soc. Lond. 236:125-140 and 141-162; Askew et al., 1989, J. Am. Chem. Soc. 
111:1082-1090. Other computer programs that screen and graphically depict chemicals are 
available from companies such as BioDesign, Inc. (Pasadena, California), Allelix, Inc. 
(Mississauga, Ontario, Canada), and Hypercube, Inc. (Cambridge, Ontario). Although 
these are primarily designed for application to drugs specific to particular proteins, they can 
be adapted to design of drugs specific to any identified region. The analogs and congeners 
can be tested for binding to the target RNA using the above-described secondary screens for 
biologic activity. Alternatively, lead compounds with little or no biologic activity, as 
ascertained in the secondary screen, can also be used to design analogs and congeners of the 
compound that have biologic activity. 

5.8. Use of Identified Compounds That Modulate Untranslated Region-Dependent Gene Expression to Treat/Prevent Disease

[0267] Biologically active compounds identified using the methods of the invention or a 
pharmaceutically acceptable salt thereof can be administered to a patient, preferably a 
mammal, more preferably a human, suffering from a disease or disorder whose onset, 
progression, development and/or severity is associated with the expression of a target gene. 
Alternatively, biologically active compounds identified using the methods of the invention 
or a pharmaceutically acceptable salt thereof that are beneficial for the treatment of a 
disease or disorder can be administered to a patient, preferably a mammal, more preferably 
a human, suffering from such a disease or disorder. In a specific embodiment, a compound 
or a pharmaceutically acceptable salt thereof is administered to a patient, preferably a 
mammal, more preferably a human, as a preventative measure against a disease or disorder 
associated with an RNA:host cell factor interaction in vivo. 

[0268] When administered to a patient, the compound or a pharmaceutically acceptable 
salt thereof is preferably administered as a component of a composition that optionally 
comprises a pharmaceutically acceptable vehicle. The composition can be administered 
orally, or by any other convenient route, for example, by infusion or bolus injection, by 
absorption through epithelial or mucocutaneous linings (e.g., oral mucosa, rectal, and 
intestinal mucosa, etc.) and may be administered together with another biologically active 
agent. Administration can be systemic or local. Various delivery systems are known, e.g., 
encapsulation in liposomes, microparticles, microcapsules, capsules, etc., and can be used 
to administer the compound and pharmaceutically acceptable salts thereof. 
[0269] Methods of administration include, but are not limited to, intradermal, 
intramuscular, intraperitoneal, intravenous, subcutaneous, intranasal, epidural, oral, 
sublingual, intracerebral, intravaginal, transdermal, and rectal administration, 
administration by inhalation, and topical administration, particularly to the ears, nose, 
eyes, or skin. The mode of administration is left to 
the discretion of the practitioner. In most instances, administration will result in the release 
of the compound or a pharmaceutically acceptable salt thereof into the bloodstream. 
[0270] In specific embodiments, it may be desirable to administer the compound or a 
pharmaceutically acceptable salt thereof locally. This may be achieved, for example, and 
not by way of limitation, by local infusion during surgery, topical application, e.g., in 
conjunction with a wound dressing after surgery, by injection, by means of a catheter, by 
means of a suppository, or by means of an implant, said implant being of a porous, non- 
porous, or gelatinous material, including membranes, such as sialastic membranes, or fibers. 
[0271] In certain embodiments, it may be desirable to introduce the compound or a 
pharmaceutically acceptable salt thereof into the central nervous system by any suitable 
route, including intraventricular, intrathecal and epidural injection. Intraventricular 
injection may be facilitated by an intraventricular catheter, for example, attached to a 
reservoir, such as an Ommaya reservoir. 

[0272] Pulmonary administration can also be employed, e.g., by use of an inhaler or 
nebulizer, and formulation with an aerosolizing agent, or via perfusion in a fluorocarbon or 
synthetic pulmonary surfactant. In certain embodiments, the compound and 
pharmaceutically acceptable salts thereof can be formulated as a suppository, with 
traditional binders and vehicles such as triglycerides. 

[0273] In another embodiment, the compound and pharmaceutically acceptable salts 
thereof can be delivered in a vesicle, in particular a liposome (see Langer, 1990, Science 
249:1527-1533; Treat et al., 1989, in Liposomes in the Therapy of Infectious Disease and 
Cancer, Lopez-Berestein and Fidler (eds.), Liss, New York, pp. 353-365; and 
Lopez-Berestein, ibid., pp. 317-327; see generally ibid.). 

[0274] In yet another embodiment, the compound and pharmaceutically acceptable salts 
thereof can be delivered in a controlled release system (see, e.g., Goodson, 1984, in Medical 
Applications of Controlled Release, supra, vol. 2, pp. 115-138). Other controlled-release 
systems discussed in the review by Langer, 1990, Science 249:1527-1533 may be used. In 
one embodiment, a pump may be used (see Langer, supra; Sefton, 1987, CRC Crit. Ref. 
Biomed. Eng. 14:201; Buchwald et al., 1980, Surgery 88:507; Saudek et al., 1989, N. Engl. 
J. Med. 321:574). In another embodiment, polymeric materials can be used (see Medical 
Applications of Controlled Release, Langer and Wise (eds.), 1974, CRC Press, Boca Raton, 
Florida; Controlled Drug Bioavailability, Drug Product Design and Performance, Smolen 
and Ball (eds.), 1984, Wiley, New York; Ranger & Peppas, 1983, J. Macromol. Sci. Rev. 
Macromol. Chem. 23:61; see also Levy et al., 1985, Science 228:190; During et al., 1989, 
Ann. Neurol. 25:351; Howard et al., 1989, J. Neurosurg. 71:105). In yet another 
embodiment, a controlled-release system can be placed in proximity of a target RNA of the 
compound or a pharmaceutically acceptable salt thereof, thus requiring only a fraction of 
the systemic dose. 

[0275] Compositions comprising the compound or a pharmaceutically acceptable salt 
thereof ("compound compositions") can additionally comprise a suitable amount of a 
pharmaceutically acceptable vehicle so as to provide the form for proper administration to 
the patient. 

[0276] In a specific embodiment, the term "pharmaceutically acceptable" means 
approved by a regulatory agency of the Federal or a state government or listed in the U.S. 
Pharmacopeia or other generally recognized pharmacopeia for use in animals, mammals, 
and more particularly in humans. The term "vehicle" refers to a diluent, adjuvant, 
excipient, or carrier with which a compound of the invention is administered. Such 
pharmaceutical vehicles can be liquids, such as water and oils, including those of 
petroleum, animal, vegetable or synthetic origin, such as peanut oil, soybean oil, mineral 
oil, sesame oil and the like. The pharmaceutical vehicles can be saline, gum acacia, gelatin, 
starch paste, talc, keratin, colloidal silica, urea, and the like. In addition, auxiliary, 
stabilizing, thickening, lubricating and coloring agents may be used. When administered to 
a patient, the pharmaceutically acceptable vehicles are preferably sterile. Water is a 
preferred vehicle when the compound of the invention is administered intravenously. 
Saline solutions and aqueous dextrose and glycerol solutions can also be employed as liquid 
vehicles, particularly for injectable solutions. Suitable pharmaceutical vehicles also include 
excipients such as starch, glucose, lactose, sucrose, gelatin, malt, rice, flour, chalk, silica 
gel, sodium stearate, glycerol monostearate, talc, sodium chloride, dried skim milk, 
glycerol, propylene glycol, water, ethanol and the like. Compound compositions, if 
desired, can also contain minor amounts of wetting or emulsifying agents, or pH buffering 
agents. 

[0277] Compound compositions can take the form of solutions, suspensions, emulsions, 
tablets, pills, pellets, capsules, capsules containing liquids, powders, sustained-release 
formulations, suppositories, aerosols, sprays, or any other form 
suitable for use. In one embodiment, the pharmaceutically acceptable vehicle is a capsule 
(see, e.g., U.S. Patent No. 5,698,155). Other examples of suitable pharmaceutical vehicles 
are described in Remington's Pharmaceutical Sciences, Alfonso R. Gennaro, ed., Mack 
Publishing Co., Easton, PA, 19th ed., 1995, pp. 1447 to 1676, incorporated herein by 
reference. 

[0278] In a preferred embodiment, the compound or a pharmaceutically acceptable salt 
thereof is formulated in accordance with routine procedures as a pharmaceutical 
composition adapted for oral administration to human beings. Compositions for oral 
delivery may be in the form of tablets, lozenges, aqueous or oily suspensions, granules, 
powders, emulsions, capsules, syrups, or elixirs, for example. Orally administered 
compositions may contain one or more agents, for example, sweetening agents such as 
fructose, aspartame or saccharin; flavoring agents such as peppermint, oil of wintergreen, or 
cherry; coloring agents; and preserving agents, to provide a pharmaceutically palatable 
preparation. Moreover, where in tablet or pill form, the compositions can be coated to 
delay disintegration and absorption in the gastrointestinal tract thereby providing a 
sustained action over an extended period of time. Selectively permeable membranes 
surrounding an osmotically active driving compound are also suitable for orally 
administered compositions. In these latter platforms, fluid from the environment 
surrounding the capsule is imbibed by the driving compound, which swells to displace the 
agent or agent composition through an aperture. These delivery platforms can provide an 
essentially zero order delivery profile as opposed to the spiked profiles of immediate release 
formulations. A time delay material such as glycerol monostearate or glycerol stearate may 
also be used. Oral compositions can include standard vehicles such as mannitol, lactose, 
starch, magnesium stearate, sodium saccharine, cellulose, magnesium carbonate, and the 
like. Such vehicles are preferably of pharmaceutical grade. Typically, compositions for 
intravenous administration comprise sterile isotonic aqueous buffer. Where necessary, the 
compositions may also include a solubilizing agent. 

[0279] In another embodiment, the compound or a pharmaceutically acceptable salt 
thereof can be formulated for intravenous administration. Compositions for intravenous 
administration may optionally include a local anesthetic such as lignocaine to lessen pain at 
the site of the injection. Generally, the ingredients are supplied either separately or mixed 
together in unit dosage form, for example, as a dry lyophilized powder or water-free 
concentrate in a hermetically sealed container such as an ampoule or sachette indicating the 
quantity of active agent. Where the compound or a pharmaceutically acceptable salt thereof 
is to be administered by infusion, it can be dispensed, for example, with an infusion bottle 
containing sterile pharmaceutical grade water or saline. Where the compound or a 
pharmaceutically acceptable salt thereof is administered by injection, an ampoule of sterile 
water for injection or saline can be provided so that the ingredients may be mixed prior to 
administration. 

[0280] The amount of a compound or a pharmaceutically acceptable salt thereof that 
will be effective in the treatment of a particular disease will depend on the nature of the 
disease, and can be determined by standard clinical techniques. In addition, in vitro or in 
vivo assays may optionally be employed to help identify optimal dosage ranges. The 
precise dose to be employed will also depend on the route of administration, and the 
seriousness of the disease, and should be decided according to the judgment of the 
practitioner and each patient's circumstances. However, suitable dosage ranges for oral 
administration are generally about 0.001 milligram to about 500 milligrams of a compound 
or a pharmaceutically acceptable salt thereof per kilogram body weight per day. In specific 
preferred embodiments of the invention, the oral dose is about 0.01 milligram to about 500 
milligrams per kilogram body weight per day, about 0.01 milligram to about 250 milligrams 
per kilogram body weight per day, about 0.01 milligram to about 100 milligrams per 
kilogram body weight per day, more preferably about 0.1 milligram to about 75 milligrams 
per kilogram body weight per day, more preferably about 0.5 milligram to 5 milligrams per 
kilogram body weight per day. The dosage amounts described herein refer to total amounts 
administered; that is, if more than one compound is administered, or if a compound is 
administered with a therapeutic agent, then the preferred dosages correspond to the total 
amount administered. Oral compositions preferably contain about 10% to about 95% active 
ingredient by weight. 

[0281] Suitable dosage ranges for intravenous (i.v.) administration are about 0.01 
milligram to about 100 milligrams per kilogram body weight per day, about 0.1 milligram 
to about 35 milligrams per kilogram body weight per day, and about 1 milligram to about 
10 milligrams per kilogram body weight per day. Suitable dosage ranges for intranasal 
administration are generally about 0.01 pg/kg body weight per day to about 1 mg/kg body 
weight per day. Suppositories generally contain about 0.01 milligram to about 50 
milligrams of a compound of the invention per kilogram body weight per day and comprise 
active ingredient in the range of about 0.5% to about 10% by weight. 
[0282] Recommended dosages for intradermal, intramuscular, intraperitoneal, 
subcutaneous, epidural, sublingual, intracerebral, intravaginal, transdermal administration 
or administration by inhalation are in the range of about 0.001 milligram to about 200 
milligrams per kilogram of body weight per day. Suitable doses for topical administration 
are in the range of about 0.001 milligram to about 1 milligram, depending on the area of 
administration. Effective doses may be extrapolated from dose-response curves derived 
from in vitro or animal model test systems. Such animal models and systems are well 
known in the art. 
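
Purely as an illustrative sketch, the weight-based dosage ranges stated above can be converted into a total daily amount for a given subject as follows. The Python function name and the 70 kg body weight in the usage note are hypothetical; the per-kilogram limits are the broadest oral and intravenous ranges recited in the preceding paragraphs. The sketch performs arithmetic only and is not a substitute for the judgment of the practitioner.

```python
# Broadest ranges recited above, in milligrams per kilogram body weight per day.
DOSE_RANGES_MG_PER_KG = {
    "oral": (0.001, 500.0),
    "intravenous": (0.01, 100.0),
}


def daily_dose_range_mg(route: str, body_weight_kg: float) -> tuple:
    """Return (minimum, maximum) total daily dose in milligrams for a route."""
    if body_weight_kg <= 0:
        raise ValueError("body weight must be positive")
    low, high = DOSE_RANGES_MG_PER_KG[route]
    return (low * body_weight_kg, high * body_weight_kg)
```

For a hypothetical 70 kg subject, `daily_dose_range_mg("intravenous", 70.0)` spans roughly 0.7 milligram to 7000 milligrams per day, which shows how wide the recited ranges are before any narrowing by dose-response data.
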

[0283] The compound and pharmaceutically acceptable salts thereof are preferably 
assayed in vitro and in vivo, for the desired therapeutic or prophylactic activity, prior to use 
in humans. For example, in vitro assays can be used to determine whether it is preferable to 
administer the compound, a pharmaceutically acceptable salt thereof, and/or another 
therapeutic agent. Animal model systems can be used to demonstrate safety and efficacy. 

5.9. Target Diseases or Disorders 

[0284] The present invention provides methods for preventing, treating or ameliorating 
a disease or disorder or one or more symptoms thereof, said methods comprising 
administering to a subject in need thereof one or more compounds identified in accordance 
with the methods of the invention. In one embodiment, the invention provides a method of 
preventing, treating or ameliorating a disease or disorder or one or more symptoms thereof, 
said method comprising administering to a subject in need thereof a dose of a 
prophylactically or therapeutically effective amount of one or more compounds identified in 
accordance with the methods of the invention. In another embodiment, the invention 
provides a method of preventing, treating or ameliorating a disease or disorder or one or 
more symptoms thereof, said method comprising administering to a subject in need thereof 
a dose of a prophylactically or therapeutically effective amount of one or more compounds 
identified in the assays described herein, said compounds increasing untranslated region-
dependent expression of a target gene whose expression is useful in the prevention or 
treatment of said disease or disorder. In another embodiment, the invention provides a 
method of preventing, treating or ameliorating a disease or disorder or one or more 
symptoms thereof, said method comprising administering to a subject in need thereof a dose 
of a prophylactically or therapeutically effective amount of one or more compounds 
identified in the assays described herein, said compounds decreasing untranslated region-
dependent expression of a target gene whose expression is associated with the onset, 
progression, development and/or severity of said disease or disorder. In a specific 
embodiment, a compound identified in accordance with the methods of the invention is not 
administered to prevent, treat, or ameliorate a disease or disorder or one or more symptoms 
thereof, if such compound has been used previously to prevent, treat or ameliorate said 
disease or disorder. 

[0285] The invention also provides methods of preventing, treating or ameliorating a 
disease or disorder or one or more symptoms thereof, said methods comprising 
administering to a subject in need thereof one or more of the compounds identified utilizing 
the screening methods described herein, and one or more other therapies (e.g., prophylactic 
or therapeutic agents). Preferably, such therapies (e.g., prophylactic or therapeutic agents) 
are currently being used, have been used or are known to be useful in the prevention, 
treatment or amelioration of one or more symptoms associated with said disease or disorder. 
The therapies (e.g., prophylactic or therapeutic agents) of the combination therapies of the 
invention can be administered sequentially or concurrently. In a specific embodiment, the 
combination therapies of the invention comprise a compound of the invention and at least 
one other therapy (e.g., prophylactic or therapeutic agent) which has a different mechanism 
of action than said compound. The combination therapies of the present invention improve 
the prophylactic or therapeutic effect of a compound of the invention by functioning 
together with the compound to have an additive or synergistic effect. The combination 
therapies of the present invention reduce the side effects associated with the therapies (e.g., 
prophylactic or therapeutic agents). 

[0286] The prophylactic or therapeutic agents of the combination therapies can be 
administered to a subject in the same pharmaceutical composition. Alternatively, the 
prophylactic or therapeutic agents of the combination therapies can be administered 
concurrently to a subject in separate pharmaceutical compositions. The prophylactic or 
therapeutic agents may be administered to a subject by the same or different routes of 
administration. 

[0287] In a specific embodiment, a pharmaceutical composition comprising one or 
more compounds identified in a screening assay described herein is administered to a 
subject, preferably a human, to prevent, treat or ameliorate a disease or disorder or a 
symptom thereof. In accordance with the invention, pharmaceutical compositions of the 
invention may also comprise one or more prophylactic or therapeutic agents which are 
currently being used, have been used or are known to be useful in the prevention, treatment 
or amelioration of one or more symptoms associated with a disease or disorder. 

5.9.1. Proliferative Disorders 
[0288] A compound identified in accordance with the methods of the invention may be 
administered to a subject in need thereof to prevent, treat or ameliorate a cancer or one or 
more symptoms thereof. A compound identified in accordance with the methods of the 
invention may also be administered in combination with one or more other therapies (e.g., 
prophylactic or therapeutic agents) to a subject in need thereof to prevent, treat or 
ameliorate a cancer or one or more symptoms thereof. Preferably, such therapies are useful 
for the prevention or treatment of cancer. Examples of such therapies include, but are not 
limited to, chemotherapeutic agents (e.g., acivicin, anthramycin, bleomycin sulfate, 
carbetimer, carboplatin, cisplatin, cyclophosphamide, daunorubicin hydrochloride, 
docetaxel, doxorubicin, doxorubicin hydrochloride, epipropidine, etoposide, etoposide 
phosphate, etoprine, fluorouracil, gemcitabine, gemcitabine hydrochloride, hydroxyurea, 
idarubicin hydrochloride, ifosfamide, ilmofosine, methotrexate, methotrexate sodium, 
paclitaxel, trimetrexate, trimetrexate glucuronate, vinblastine sulfate, vincristine sulfate, 
vindesine, vindesine sulfate, and vinepidine sulfate) and anti-angiogenic agents (e.g., 
angiostatin (plasminogen fragment), antiangiogenic antithrombin III, angiozyme, 
combretastatin A-4, endostatin (collagen XVIII fragment), and fibronectin fragment). In a 
specific embodiment, the invention provides a method of preventing, treating or 
ameliorating cancer or one or more symptoms thereof, said method comprising 
administering to a subject in need thereof a dose of a prophylactically or therapeutically 
effective amount of a compound identified in accordance with the methods of the invention. 
In another embodiment, the invention provides a method of preventing, treating or 
ameliorating cancer or one or more symptoms thereof, said method comprising 
administering to a subject in need thereof a dose of a prophylactically or therapeutically 
effective amount of a compound identified in accordance with the methods of the invention 
and a dose of a prophylactically or therapeutically effective amount of one or more other 
therapies (e.g., prophylactic or therapeutic agents). 

[0289] A compound identified in accordance with the methods of the invention may be 
used as a first, second, third or fourth line of therapy for the treatment of cancer. The 
invention provides methods for treating or ameliorating cancer or a symptom thereof in a 
subject refractory to conventional therapies for such a cancer, said methods comprising 
administering to said subject a dose of a prophylactically or therapeutically effective 
amount of a compound identified in accordance with the methods of the invention. A 
cancer may be determined to be refractory to a therapy when at least some 
significant portion of the cancer cells are not killed or their cell division arrested in response 
to the therapy. Such a determination can be made either in vivo or in vitro by any method 
known in the art for assaying the effectiveness of treatment on cancer cells, using the art-
accepted meanings of "refractory" in such a context. In a specific embodiment, a cancer is 
refractory where the number of cancer cells has not been significantly reduced, or has 
increased. 

[0290] The invention provides methods for treating or ameliorating cancer or a 
symptom thereof in a subject refractory to existing single agent therapies for such a cancer, 
said methods comprising administering to said subject a dose of a prophylactically or 
therapeutically effective amount of a compound identified in accordance with the methods 
of the invention and a dose of a prophylactically or therapeutically effective amount of one 
or more other therapies (e.g., prophylactic or therapeutic agents). The invention also 
provides methods for treating cancer by administering a compound identified in accordance 
with the methods of the invention in combination with any other therapy (e.g., radiation 
therapy, chemotherapy or surgery) to patients who have proven refractory to other therapies 
but are no longer on these therapies. The invention also provides methods for the treatment 
of a patient having cancer and immunosuppressed by reason of having previously 
undergone other cancer therapies. The invention also provides alternative methods for the 
treatment of cancer where chemotherapy, radiation therapy, hormonal therapy, and/or 
biological therapy/immunotherapy has proven or may prove too toxic, i.e., results in 
unacceptable or unbearable side effects, for the subject being treated. Further, the invention 
provides methods for preventing the recurrence of cancer in patients that have been treated 
and have no disease activity by administering a compound identified in accordance with the 
methods of the invention. 

[0291] In this embodiment, target genes encoding proteins include, but are not limited 
to, angiogenin; angiopoietinl ; angiopoietin2; antigen CD82; aryl hydrocarbon receptor 
nuclear translocator; B cell lymphoma 2; beta-catenin; cadherin-1; CLCA homolog; 
connective tissue growth factor, cysteine-rich 61; cyclin Dl; cyclin-dependent kinase 
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inhibitor 2A, CDKJN2 UDK4 mhibitor multiple tumor suppressor 1, SOR 1, MTS 1TP16 
pl6(INK4) pl6(INK4A) pl4(ARF); cyclin-dependent kinase inhibitor lA (p21, Cipl); 
dihydrofolate reductase; DNA methytoansferase; effector cell protease receptor; 
EMMPRIN; epithelial growth factor receptor; fibroblast growth factor 2; fibroblast growth 
factor 1; FMS-related tyrosine kinase 1; heparanase; hepsin; Her-2; histone 
acetyltraiisferase; histone deacetylaseS; histone deacetylase 1; Hu Antigen R, a member of 
the Elav (embryonic lethal abnonnal vision) family of RNA-binding proteins; hypoxia- 
inducible factor 1 -alpha inhibitor; hypoxia-inducible factor 1; insulin like growth factor 1 
receptor, IGF-IR; insulin-like gi-owth factor 1; insulin- like growth factor binding protein-2; 
interleukin 2; interleukin-8 precursor (il-8) (monocyte-derived neutrophil chemotactic 
factor) (mdiisf) (T-cell chemotactic factor) (neutrophil-activating protein 1) (nap-1) 
(lymphoc)d:e-derivedneutrophil-activating factor) (lynap) (protein 3-1 Oc) (neutrophil- 
activating factor) (naf) (granulocyte chemotactic protein 1) (gsp-1); kit ligand, stem cell 
factor; large tumor suppressor; leucine amino peptidase-3; livin; major histocompatibility 
complex class I chain-related gene B; major histocompatibility complex class I chain- 
related gene A; matrix metalloproteinase 9; matrix metalloproteinase 12; max interacting 
protein 1 (mxi 1 protein); methyl-CpG-binding endonuclease; NF-Kappa-B; oncoprotein 
MDM2; oncoprotein fos; P-glycoprotein-1 (PGYl); placental gi-owth factor; plasminogen 
activator inhibitor protein; platelet derived growth factor, beta chain; pleiotrophin; 
progranulin; proliferating cell nuclear antigen; protein kinase B/Akt (PBK), v-akt murine 
thymoma viral oncogene homolog 1, oncogene aktl protein kinase b, pkb rac 
serine/threonine protein kinase; protein-tyro sine phosphatase, 4A, 3, PTP4A3; ras; 
retinoblastoma-binding protein 1-like 1; ribonuclease/angiogenin inhibitor; soluble-type 
polypeptide FZD4S; src, oncogen src protooncogene src src oncogene avian sarcoma; TEK 
tyrosine kinase; thrombopoietin (TPO); TIAM1: T-cell lymphoma invasion and metastasis 
1; tissue inhibitor of metalloprotease 1; tissue inhibitor of metalloprotease 2; tissue inhibitor 
of metalloprotease 4; transforming growth factor, beta-1; tumor necrosis factor receptor 
superfamily, member 5, TNFRSF5; urokinase plasminogen activator; and v-myc 
myelocytomatosis viral oncogene homolog. 

[0292] In a specific embodiment, a compound identified in the assays described herein 
to down-regulate untranslated region-dependent VEGF expression may be used to prevent, 
treat or ameliorate a vascularized tumor or one or more symptoms thereof. In another 
embodiment, a compound not previously known to affect VEGF expression which was 
identified in the assays described herein to down-regulate untranslated region-dependent 
VEGF expression may be used to prevent, treat or ameliorate a vascularized tumor or one or more 
symptoms thereof. 

[0293] In another embodiment, a compound identified in the assays described herein to 
down-regulate untranslated region-dependent survivin expression may be used to prevent, 
treat or ameliorate cancer (in particular, cancer in which survivin is highly expressed) or 
one or more symptoms thereof. In another embodiment, a compound not previously known 
to affect survivin expression which was identified in the assays described herein to down- 
regulate untranslated region-dependent survivin expression may be used to prevent, treat or ameliorate 
cancer or one or more symptoms thereof. 

[0294] In another embodiment, a compound identified in the assays described herein to 
down-regulate untranslated region-dependent Her-2 expression in breast cancer cells may 
be used to prevent, treat or ameliorate breast cancer (in particular, Her-2 positive breast 
cancer) or one or more symptoms thereof. In another embodiment, a compound not 
previously known to affect Her-2 expression which was identified in the assays described 
herein to down-regulate untranslated region-dependent Her-2 expression may be used to 
prevent, treat or ameliorate breast cancers or one or more symptoms thereof. 
[0295] Cancers that can be treated by the methods encompassed by the invention 
include, but are not limited to, neoplasms, tumors, metastases, or any disease or disorder 
characterized by uncontrolled cell growth. The cancer may be a primary or metastatic 
cancer. Specific examples of cancers that can be treated by the methods encompassed by 
the invention include, but are not limited to, cancer of the head, neck, eye, mouth, throat, 
esophagus, chest, bone, lung, colon, rectum, stomach, prostate, breast, ovaries, kidney, 
liver, pancreas, and brain. Additional cancers include, but are not limited to, the following: 
leukemias such as but not limited to, acute leukemia, acute lymphocytic leukemia, acute 
myelocytic leukemias such as myeloblastic, promyelocytic, myelomonocytic, monocytic, 
erythroleukemia leukemias and myelodysplastic syndrome, chronic leukemias such as but 
not limited to, chronic myelocytic (granulocytic) leukemia, chronic lymphocytic leukemia, 
hairy cell leukemia; polycythemia vera; lymphomas such as but not limited to Hodgkin's 
disease, non-Hodgkin's disease; multiple myelomas such as but not limited to smoldering 
multiple myeloma, nonsecretory myeloma, osteosclerotic myeloma, plasma cell leukemia, 
solitary plasmacytoma and extramedullary plasmacytoma; Waldenstrom's 
macroglobulinemia; monoclonal gammopathy of undetermined significance; benign 
monoclonal gammopathy; heavy chain disease; bone and connective tissue sarcomas such 
as but not limited to bone sarcoma, osteosarcoma, chondrosarcoma, Ewing's sarcoma, 
malignant giant cell tumor, fibrosarcoma of bone, chordoma, periosteal sarcoma, soft-tissue 
sarcomas, angiosarcoma (hemangiosarcoma), fibrosarcoma, Kaposi's sarcoma, 
leiomyosarcoma, liposarcoma, lymphangiosarcoma, neurilemmoma, rhabdomyosarcoma, 
synovial sarcoma; brain tumors such as but not limited to, glioma, astrocytoma, brain stem 
glioma, ependymoma, oligodendroglioma, nonglial tumor, acoustic neurinoma, 
craniopharyngioma, medulloblastoma, meningioma, pineocytoma, pineoblastoma, primary 
brain lymphoma; breast cancer including but not limited to adenocarcinoma, lobular (small 
cell) carcinoma, intraductal carcinoma, medullary breast cancer, mucinous breast cancer, 
tubular breast cancer, papillary breast cancer, Paget's disease, and inflammatory breast 
cancer; adrenal cancer such as but not limited to pheochromocytoma and adrenocortical 
carcinoma; thyroid cancer such as but not limited to papillary or follicular thyroid cancer, 
medullary thyroid cancer and anaplastic thyroid cancer; pancreatic cancer such as but not 
limited to, insulinoma, gastrinoma, glucagonoma, vipoma, somatostatin-secreting tumor, 
and carcinoid or islet cell tumor; pituitary cancers such as but not limited to Cushing's disease, 
prolactin-secreting tumor, acromegaly, and diabetes insipidus; eye cancers such as but not 
limited to ocular melanoma such as iris melanoma, choroidal melanoma, and ciliary body 
melanoma, and retinoblastoma; vaginal cancers such as squamous cell carcinoma, 
adenocarcinoma, and melanoma; vulvar cancer such as squamous cell carcinoma, 
melanoma, adenocarcinoma, basal cell carcinoma, sarcoma, and Paget's disease; cervical 
cancers such as but not limited to, squamous cell carcinoma, and adenocarcinoma; uterine 
cancers such as but not limited to endometrial carcinoma and uterine sarcoma; ovarian 
cancers such as but not limited to, ovarian epithelial carcinoma, borderline tumor, germ cell 
tumor, and stromal tumor; esophageal cancers such as but not limited to, squamous cancer, 
adenocarcinoma, adenoid cystic carcinoma, mucoepidermoid carcinoma, adenosquamous 
carcinoma, sarcoma, melanoma, plasmacytoma, verrucous carcinoma, and oat cell (small 
cell) carcinoma; stomach cancers such as but not limited to, adenocarcinoma, fungating 
(polypoid), ulcerating, superficial spreading, diffusely spreading, malignant lymphoma, 
liposarcoma, fibrosarcoma, and carcinosarcoma; colon cancers; rectal cancers; liver cancers 
such as but not limited to hepatocellular carcinoma and hepatoblastoma; gallbladder cancers 
such as adenocarcinoma; cholangiocarcinomas such as but not limited to papillary, 
nodular, and diffuse; lung cancers such as non-small cell lung cancer, squamous cell 
carcinoma (epidermoid carcinoma), adenocarcinoma, large-cell carcinoma and small-cell 
lung cancer; testicular cancers such as but not limited to germinal tumor, seminoma, 
anaplastic, classic (typical), spermatocytic, nonseminoma, embryonal carcinoma, teratoma 
carcinoma, choriocarcinoma (yolk-sac tumor); prostate cancers such as but not limited to, 
adenocarcinoma, leiomyosarcoma, and rhabdomyosarcoma; penile cancers; oral cancers 
such as but not limited to squamous cell carcinoma; basal cancers; salivary gland cancers 
such as but not limited to adenocarcinoma, mucoepidermoid carcinoma, and adenoidcystic 
carcinoma; pharynx cancers such as but not limited to squamous cell cancer, and verrucous; 
skin cancers such as but not limited to, basal cell carcinoma, squamous cell carcinoma and 
melanoma, superficial spreading melanoma, nodular melanoma, lentigo malignant 
melanoma, acral lentiginous melanoma; kidney cancers such as but not limited to renal cell 
cancer, adenocarcinoma, hypernephroma, fibrosarcoma, transitional cell cancer (renal 
pelvis and/or ureter); Wilms' tumor; bladder cancers such as but not limited to transitional 
cell carcinoma, squamous cell cancer, adenocarcinoma, carcinosarcoma. In addition, 
cancers include myxosarcoma, osteogenic sarcoma, endotheliosarcoma, 
lymphangioendotheliosarcoma, mesothelioma, synovioma, hemangioblastoma, epithelial 
carcinoma, cystadenocarcinoma, bronchogenic carcinoma, sweat gland carcinoma, 
sebaceous gland carcinoma, papillary carcinoma and papillary adenocarcinomas (for a 
review of such disorders, see Fishman et al., 1985, Medicine, 2d Ed., J.B. Lippincott Co., 
Philadelphia and Murphy et al., 1997, Informed Decisions: The Complete Book of Cancer 
Diagnosis, Treatment, and Recovery, Viking Penguin, Penguin Books U.S.A., Inc., United 
States of America). It is also contemplated that cancers caused by aberrations in apoptosis 
can also be treated by the methods and compositions of the invention. Such cancers may 
include, but not be limited to, follicular lymphomas, carcinomas with p53 mutations, 
hormone dependent tumors of the breast, prostate and ovary, and precancerous lesions such 
as familial adenomatous polyposis, and myelodysplastic syndromes. 
[0296] Anti-cancer therapies and their dosages, routes of administration and 
recommended usage are known in the art and have been described in such literature as the 
Physician's Desk Reference (56th ed., 2002). 

5.9.2. Inflammatory Disorders 

[0297] A compound identified in accordance with the methods of the invention may be 
administered to a subject in need thereof to prevent, treat or ameliorate an inflammatory 
disorder or one or more symptoms thereof. A compound identified in accordance with the 
methods of the invention may also be administered in combination with one or more other 
therapies (e.g., prophylactic or therapeutic agents) to a subject in need thereof to prevent, 
treat or ameliorate an inflammatory disorder or one or more symptoms thereof. Preferably, 
such therapies are useful for the prevention or treatment of an inflammatory disorder. 
Examples of such therapies include, but are not limited to, immunomodulatory agents (e.g., 
methotrexate, leflunomide, cyclophosphamide, Cytoxan, Immuran, cyclosporine A, 
minocycline, azathioprine, and antibiotics (e.g., FK506 (tacrolimus))), anti-angiogenic 
agents (e.g., endostatin, angiostatin, apomigren, anti-angiogenic antithrombin III, the 29 
kDa N-terminal and a 40 kDa C-terminal proteolytic fragments of fibronectin, the anti- 
angiogenic factor designated 13.40, the anti-angiogenic 22 amino acid peptide fragment of 
thrombospondin I, the anti-angiogenic 20 amino acid peptide fragment of SPARC, RGD 
and NGR containing peptides, the small anti-angiogenic peptides of laminin, fibronectin, 
procollagen and EGF, acid fibroblast growth factor ("aFGF") antagonists, basic fibroblast 
growth factor ("bFGF") antagonists, vascular endothelial growth factor ("VEGF") 
antagonists, and VEGF receptor ("VEGFR") antagonists (e.g., anti-VEGFR antibodies)), 
TNF-α antagonists (e.g., infliximab (REMICADE™; Centocor), D2E7 (Abbott 
Laboratories/Knoll Pharmaceuticals Co., Mt. Olive, NJ), CDP571 which is also known as 
HUMICADE™ and CDP-870 (both of Celltech/Pharmacia, Slough, U.K.), and TN3-19.12 
(Williams et al., 1994, Proc. Natl. Acad. Sci. USA 91:2762-2766; Thorbecke et al., 1992, 
Proc. Natl. Acad. Sci. USA 89:7375-7379)), non-steroidal anti-inflammatory drugs 
(NSAIDs) (e.g., aspirin, ibuprofen, celecoxib (CELEBREX™), diclofenac 
(VOLTAREN™), etodolac (LODINE™), fenoprofen (NALFON™), indomethacin 
(INDOCIN™), ketorolac (TORADOL™), oxaprozin (DAYPRO™), nabumetone 
(RELAFEN™), sulindac (CLINORIL™), tolmetin (TOLECTIN™), rofecoxib 
(VIOXX™), naproxen (ALEVE™, NAPROSYN™), and ketoprofen (ACTRON™)), 
steroidal anti-inflammatory drugs (e.g., glucocorticoids, 
dexamethasone (DECADRON™), cortisone, hydrocortisone, prednisone 
(DELTASONE™), prednisolone, triamcinolone, azulfidine, and eicosanoids such as 
prostaglandins, thromboxanes, and leukotrienes), beta-agonists, anticholinergic agents, and 
methyl xanthines. In a specific embodiment, the invention provides a method of 
preventing, treating or ameliorating an inflammatory disorder or one or more symptoms 
thereof, said method comprising administering to a subject in need thereof a dose of a 
prophylactically or therapeutically effective amount of a compound identified in accordance 
with the methods of the invention. In another embodiment, the invention provides a method 
of preventing, treating or ameliorating an inflammatory disorder or one or more symptoms 
thereof, said method comprising administering to a subject in need thereof a dose of a 
prophylactically or therapeutically effective amount of a compound identified in accordance 
with the methods of the invention and a dose of a prophylactically or therapeutically 
effective amount of one or more other therapies (e.g., prophylactic or therapeutic agents). 
[0298] The invention provides methods for treating or ameliorating an inflammatory 
disorder or a symptom thereof in a subject refractory to conventional therapies (e.g., 
methotrexate and a TNF-α antagonist (e.g., REMICADE™ or ENBREL™)) for such an 
inflammatory disorder, said methods comprising administering to said subject a dose of a 
prophylactically or therapeutically effective amount of a compound identified in accordance 
with the methods of the invention. The invention also provides methods for treating or 
ameliorating an inflammatory disorder or a symptom thereof in a subject refractory to 
existing single agent therapies for such an inflammatory disorder, said methods comprising 
administering to said subject a dose of a prophylactically or therapeutically effective 
amount of a compound identified in accordance with the methods of the invention and a 
dose of a prophylactically or therapeutically effective amount of one or more other 
therapies (e.g., prophylactic or therapeutic agents). The invention also provides methods 
for treating an inflammatory disorder by administering a compound identified in accordance 
with the methods of the invention in combination with any other therapy to patients who 
have proven refractory to other therapies but are no longer on these therapies. The 
invention also provides alternative methods for the treatment of an inflammatory disorder 
where another therapy has proven or may prove too toxic, i.e., results in unacceptable or 
unbearable side effects, for the subject being treated. Further, the invention provides 
methods for preventing the recurrence of an inflammatory disorder in patients that have 
been treated and have no disease activity by administering a compound identified in 
accordance with the methods of the invention. 

[0299] In this embodiment, target genes encoding proteins include, but are not limited 
to, a disintegrin and metalloproteinase domain 33; angiopoietin 1; angiopoietin 2; beta- 
catenin; chemokine (C-C) receptor; eotaxin; fibroblast growth factor 1; fibroblast growth 
factor 2; FMS-related tyrosine kinase 1; granulocyte-macrophage colony-stimulating 
factor precursor (GM-CSF) (colony-stimulating factor) (CSF) (sargramostim); GRO2 
oncogene; macrophage inflammatory protein-2-alpha precursor (mip2-alpha) (growth 
regulated protein beta) (gro-beta); Hu antigen R, a member of the Elav (embryonic lethal 
abnormal vision) family of RNA-binding proteins; insulin-like growth factor 1; interferon 
inducible protein; interferon 1 beta; interferon-alpha; interleukin 17F; interleukin 1-beta; 
interleukin 6; interleukin 10; interleukin 18; interleukin 13; interleukin 4; interleukin-9; 
leukemia inhibitory factor receptor; leukemia inhibitory factor; linker for activation of T 
cells; macrophage migration inhibitory factor; monocyte chemotactic protein 1; NF-Kappa- 
B; nuclear factor of kappa light polypeptide gene enhancer in B-cells 1 (p105, NF-kappaB); 
osteopontin; p38 MAP kinase; placental growth factor; platelet derived growth factor, beta 
chain; pleiotrophin; prolactin; receptor for interleukin-4; signal transducer and activator of 
transcription 6; TEK tyrosine kinase; and tumor necrosis factor alpha. 

[0300] Inflammatory disorders that can be treated by the methods encompassed by the 
invention include, but are not limited to, asthma, encephalitis, inflammatory bowel disease, 
chronic obstructive pulmonary disease (COPD), allergic disorders, septic shock, pulmonary 
fibrosis, undifferentiated spondyloarthropathy, undifferentiated arthropathy, arthritis, 
inflammatory osteolysis, and chronic inflammation resulting from chronic viral or bacterial 
infections. Some autoimmune disorders are associated with an inflammatory condition, and 
thus, can be characterized as either or both an autoimmune disorder and/or an inflammatory 
disorder. 

[0301] Anti-inflammatory therapies and their dosages, routes of administration and 
recommended usage are known in the art and have been described in such literature as the 
Physician's Desk Reference (56th ed., 2002). 

5.9.3. Autoimmune Disorders 

[0302] A compound identified in accordance with the methods of the invention may be 
administered to a subject in need thereof to prevent, treat or ameliorate an autoimmune 
disorder or one or more symptoms thereof. A compound identified in accordance with the 
methods of the invention may also be administered in combination with one or more other 
therapies (e.g., prophylactic or therapeutic agents) to a subject in need thereof to prevent, 
treat or ameliorate an autoimmune disorder or one or more symptoms thereof. Preferably, 
such therapies are useful for the prevention or treatment of an autoimmune disorder. In a 
specific embodiment, the invention provides a method of preventing, treating or 
ameliorating an autoimmune disorder or one or more symptoms thereof, said method 
comprising administering to a subject in need thereof a dose of a prophylactically or 
therapeutically effective amount of a compound identified in accordance with the methods 
of the invention. In another embodiment, the invention provides a method of preventing, 
treating or ameliorating an autoimmune disorder or one or more symptoms thereof, said 
method comprising administering to a subject in need thereof a dose of a prophylactically or 
therapeutically effective amount of a compound identified in accordance with the methods 
of the invention and a dose of a prophylactically or therapeutically effective amount of one 
or more other therapies (e.g., prophylactic or therapeutic agents). 

[0303] The invention provides methods for treating or ameliorating an autoimmune 
disorder or a symptom thereof in a subject refractory to conventional therapies for such an 
autoimmune disorder, said methods comprising administering to said subject a dose of a 
prophylactically or therapeutically effective amount of a compound identified in accordance 
with the methods of the invention. The invention also provides methods for treating or 
ameliorating an autoimmune disorder or a symptom thereof in a subject refractory to 
existing single agent therapies for such an autoimmune disorder, said methods comprising 
administering to said subject a dose of a prophylactically or therapeutically effective 
amount of a compound identified in accordance with the methods of the invention and a 
dose of a prophylactically or therapeutically effective amount of one or more other 
therapies (e.g., prophylactic or therapeutic agents). The invention also provides methods 
for treating an autoimmune disorder by administering a compound identified in accordance 
with the methods of the invention in combination with any other therapy to patients who 
have proven refractory to other therapies but are no longer on these therapies. The 
invention also provides alternative methods for the treatment of an autoimmune disorder 
where another therapy has proven or may prove too toxic, i.e., results in unacceptable or 
unbearable side effects, for the subject being treated. Further, the invention provides 
methods for preventing the recurrence of an autoimmune disorder in patients that have been 
treated and have no disease activity by administering a compound identified in accordance 
with the methods of the invention. 

[0304] In this embodiment, target genes encoding proteins include, but are not limited 
to, adiponectin; alpha-glucosidase; forkhead box C2; G-CSF, colony stimulating factor 3 
(granulocyte); galanin; gastric inhibitory polypeptide; ghrelin; glucagon receptor; glucagon- 
like peptide-1, GLP-1; glucokinase; glycogen synthase kinase-3B; glycogen synthase 
kinase-3A; human phosphotyrosyl-protein phosphatase (PTP-1B); IκB kinase; inositol 
polyphosphate phosphatase-like 1; insulin receptor; interleukin 10; leptin; neural cell 
adhesion molecule 1; neuron growth associated protein 43; peroxisome proliferator- 
activated receptor-gamma; phas; EIF4BP; protein kinase C, gamma; resistin; and 
uncoupling protein 2. 

[0305] In autoimmune disorders, the immune system triggers an immune response 
when there are no foreign substances to fight and the body's normally protective immune 
system causes damage to its own tissues by mistakenly attacking self. There are many 
different autoimmune disorders which affect the body in different ways. For example, the 
brain is affected in individuals with multiple sclerosis, the gut is affected in individuals with 
Crohn's disease, and the synovium, bone and cartilage of various joints are affected in 
individuals with rheumatoid arthritis. As autoimmune disorders progress, destruction of one 
or more types of body tissues, abnormal growth of an organ, or changes in organ function 
may result. The autoimmune disorder may affect only one organ or tissue type or may 
affect multiple organs and tissues. Organs and tissues commonly affected by autoimmune 
disorders include red blood cells, blood vessels, connective tissues, endocrine glands (e.g., 
the thyroid or pancreas), muscles, joints, and skin. Examples of autoimmune disorders that 
can be prevented, treated or ameliorated by the methods of the invention include, but are 
not limited to, alopecia areata, ankylosing spondylitis, antiphospholipid syndrome, 
autoimmune Addison's disease, autoimmune diseases of the adrenal gland, autoimmune 
hemolytic anemia, autoimmune hepatitis, autoimmune oophoritis and orchitis, autoimmune 
thrombocytopenia, Behcet's disease, bullous pemphigoid, cardiomyopathy, celiac sprue- 
dermatitis, chronic fatigue immune dysfunction syndrome (CFIDS), chronic inflammatory 
demyelinating polyneuropathy, Churg-Strauss syndrome, cicatricial pemphigoid, CREST 
syndrome, cold agglutinin disease, Crohn's disease, discoid lupus, essential mixed 
cryoglobulinemia, fibromyalgia-fibromyositis, glomerulonephritis, Graves' disease, 
Guillain-Barre, Hashimoto's thyroiditis, idiopathic pulmonary fibrosis, idiopathic 
thrombocytopenia purpura (ITP), IgA neuropathy, juvenile arthritis, lichen planus, lupus 
erythematosus, Meniere's disease, mixed connective tissue disease, multiple sclerosis, type 1 
or immune-mediated diabetes mellitus, myasthenia gravis, pemphigus vulgaris, pernicious 
anemia, polyarteritis nodosa, polychondritis, polyglandular syndromes, polymyalgia 
rheumatica, polymyositis and dermatomyositis, primary agammaglobulinemia, primary 
biliary cirrhosis, psoriasis, psoriatic arthritis, Raynaud's phenomenon, Reiter's syndrome, 
rheumatoid arthritis, sarcoidosis, scleroderma, Sjogren's syndrome, stiff-man syndrome, 
systemic lupus erythematosus, lupus erythematosus, Takayasu arteritis, temporal 
arteritis/giant cell arteritis, ulcerative colitis, uveitis, vasculitides such as dermatitis 
herpetiformis vasculitis, vitiligo, and Wegener's granulomatosis. 
[0306] Autoimmune therapies and their dosages, routes of administration and 
recommended usage are known in the art and have been described in such literature as the 
Physician's Desk Reference (56th ed., 2002). 

5.9.4. Genetic Disorders 

[0307] A compound identified in accordance with the methods of the invention may be 
administered to a subject in need thereof to prevent, treat or ameliorate a genetic disorder or 
one or more symptoms thereof. A compound identified in accordance with the methods of 
the invention may also be administered in combination with one or more other therapies (e.g., 
prophylactic or therapeutic agents) to a subject in need thereof to prevent, treat or 
ameliorate a genetic disorder or one or more symptoms thereof. Preferably, such therapies 
are useful for the prevention or treatment of a genetic disorder. In a specific embodiment, 
the invention provides a method of preventing, treating or ameliorating a genetic disorder or 
one or more symptoms thereof, said method comprising administering to a subject in need 
thereof a dose of a prophylactically or therapeutically effective amount of a compound 
identified in accordance with the methods of the invention. In another embodiment, the 
invention provides a method of preventing, treating or ameliorating a genetic disorder or 
one or more symptoms thereof, said method comprising administering to a subject in need 
thereof a dose of a prophylactically or therapeutically effective amount of a compound 
identified in accordance with the methods of the invention and a dose of a prophylactically 
or therapeutically effective amount of one or more other therapies (e.g., prophylactic or 
therapeutic agents). 

[0308] In this embodiment, target genes encoding proteins include, but are not limited 
to, NAD(P)-dependent steroid dehydrogenase (EC 1.1.1.-) (h105e3 protein); peroxisome 
biogenesis factor 1 (peroxin-1) (peroxisome biogenesis disorder protein 1); and utrophin. 

[0309] Examples of genetic disorders which can be prevented or treated in accordance 
with the invention include, but are not limited to, alopecia areata, alpha-1-antitrypsin 
deficiency, ataxia, Fragile X Syndrome, Gaucher disease, Hemophilia, Huntington disease, 
Niemann-Pick disease, Retinitis Pigmentosa, SCID (Severe Combined Immunodeficiency), 
Thalassemia, and Xeroderma Pigmentosum. 

5.9.5. Viral Infections 

[0310] A compound identified in accordance with the methods of the invention may be 
administered to a subject in need thereof to prevent, treat or ameliorate a viral infection or 
one or more conditions or symptoms associated therewith. A compound identified in 
accordance with the methods of the invention may also be administered in combination 
with one or more other therapies (e.g., prophylactic or therapeutic agents) to a subject in 
need thereof to prevent, treat or ameliorate a viral infection or one or more conditions or 
symptoms associated therewith. Preferably, such therapies are useful for the prevention or 
treatment of a viral infection. Examples of such therapies include, but are not limited to, 
amantadine, ribavirin, rimantadine, acyclovir, famciclovir, foscarnet, ganciclovir, 
trifluridine, vidarabine, didanosine, stavudine, zalcitabine, zidovudine, interferon, an 
antibiotic, PRO542 (Progenics) which is a CD4 fusion antibody useful for the treatment of 
HIV infection, Ostavir (Protein Design Labs, Inc., CA) which is a human antibody useful 
for the treatment of hepatitis B virus, and Protovir (Protein Design Labs, Inc., CA) which is 
a humanized IgG1 antibody useful for the treatment of cytomegalovirus (CMV). In a 
specific embodiment, the invention provides a method of preventing, treating or 
ameliorating a viral infection or one or more symptoms thereof, said method comprising 
administering to a subject in need thereof a dose of a prophylactically or therapeutically 
effective amount of a compound identified in accordance with the methods of the invention. 
In another embodiment, the invention provides a method of preventing, treating or 
ameliorating a viral infection or one or more symptoms thereof, said method comprising 
administering to a subject in need thereof a dose of a prophylactically or therapeutically 
effective amount of a compound identified in accordance with the methods of the invention 
and a dose of a prophylactically or therapeutically effective amount of one or more other 
therapies (e.g., prophylactic or therapeutic agents). 

[0311] The invention provides methods for treating or ameliorating a viral infection or a 
symptom thereof in a subject refractory to conventional therapies for such a viral infection, 
said methods comprising administering to said subject a dose of a prophylactically or 
therapeutically effective amount of a compound identified in accordance with the methods 
of the invention. The invention also provides methods for treating or ameliorating a viral 
infection or a symptom thereof in a subject refractory to existing single agent therapies for 
such a viral infection, said methods comprising administering to said subject a dose of a 
prophylactically or therapeutically effective amount of a compound identified in accordance 
with the methods of the invention and a dose of a prophylactically or therapeutically 
effective amount of one or more other therapies (e.g., prophylactic or therapeutic agents). 
The invention also provides methods for treating a viral infection by administering a 
compound identified in accordance with the methods of the invention in combination with 
any other therapy to patients who have proven refractory to other therapies but are no 
longer on these therapies. The invention also provides alternative methods for the treatment 
of a viral infection where another therapy has proven or may prove too toxic, i.e., results in 
unacceptable or unbearable side effects, for the subject being treated. Further, the invention 
provides methods for preventing the recurrence of a viral infection in patients that have 
been treated and have no disease activity by administering a compound identified in 
accordance with the methods of the invention. 

[0312] In this embodiment, target genes encoding proteins include, but are not limited 
to, C1q complement receptor-gC1qR; chemokine (C-X3-C) receptor 1; complement decay- 
accelerating factor [Precursor] Synonym CD55 antigen; cyclin T1; desmoglein 1; hepatitis A 
virus cellular receptor 1-havcr-1; hepatitis B virus X interacting protein-XIP; HIV Tat 
Specific Factor 1; human interferon gamma; human damage specific DNA binding protein- 
DDB1; INI1/hSNF5; interferon alpha-16 precursor (interferon alpha-wa); interferon alpha-5 
precursor (interferon alpha-g) (leif g) (interferon alpha-61); human leukocyte (alpha) 
interferon; interferon alpha-1/13 precursor (interferon alpha-d) (leif d); interferon-beta 1; 
interleukin 8 precursor (il-8) (monocyte-derived neutrophil chemotactic factor) (mdnsf) (T- 
cell chemotactic factor) (neutrophil-activating protein 1) (nap-1) (lymphocyte-derived 
neutrophil-activating factor) (lynap) (protein 3-10c) (neutrophil-activating factor) (naf) 
(granulocyte chemotactic protein 1) (gsp-1); interleukin 2; interleukin-12 beta chain 
precursor (il-12b) (cytotoxic lymphocyte maturation factor 40 kda subunit) (clmf p40) 
(nk cell stimulatory factor chain 2) (nksf2); natural resistance-associated macrophage 
protein 1 (nramp 1); 
p300/CBP associated factor (PCAF); poly(rC) binding protein 2; and virion infectivity 
factor. 

[0313] Any type of viral infection can be prevented, treated or ameliorated in 
accordance with the methods of the invention. Examples of viruses which cause viral 
infections include, but are not limited to, retroviruses (e.g., human T-cell lymphotropic 
virus (HTLV) types I and II and human immunodeficiency virus (HIV)), herpes viruses (e.g., 
herpes simplex virus (HSV) types I and II, Epstein-Barr virus, HHV6-HHV8, and 
cytomegalovirus), arenaviruses (e.g., Lassa fever virus), paramyxoviruses (e.g., 
morbillivirus, human respiratory syncytial virus, mumps, and pneumovirus), adenoviruses, 
bunyaviruses (e.g., hantavirus), coronaviruses, filoviruses (e.g., Ebola virus), flaviviruses 
(e.g., hepatitis C virus (HCV), yellow fever virus, and Japanese encephalitis virus), 
hepadnaviruses (e.g., hepatitis B viruses (HBV)), orthomyxoviruses (e.g., influenza viruses 
A, B and C), papovaviruses (e.g., papillomaviruses), picornaviruses (e.g., rhinoviruses, 
enteroviruses and hepatitis A viruses), poxviruses, reoviruses (e.g., rotaviruses), togaviruses 
(e.g., rubella virus), and rhabdoviruses (e.g., rabies virus). 

[0314] Viral infection therapies and their dosages, routes of administration and 
recommended usage are known in the art and have been described in such literature as the 
Physician's Desk Reference (56th ed., 2002). 

5.9.6. Fungal Infections 

[0315] A compound identified in accordance with the methods of the invention may be 
administered to a subject in need thereof to prevent, treat or ameliorate a fungal infection or 
one or more conditions or symptoms associated therewith. A compound identified in 
accordance with the methods of the invention may also be administered in combination 
with one or more other therapies (e.g., prophylactic or therapeutic agents) to a subject in 
need thereof to prevent, treat or ameliorate a fungal infection or one or more conditions or 
symptoms associated therewith. Preferably, such therapies are useful for the prevention or 
treatment of a fungal infection. Examples of such therapies include, but are not limited to, 
amphotericin B or analogs or derivatives thereof (including 14(s)-hydroxyamphotericin B 
methyl ester, the hydrazide of amphotericin B with 1-amino-4-methylpiperazine, and other 
derivatives) or other polyene macrolide antibiotics (including, e.g., nystatin, candicidin, 
pimaricin and natamycin); flucytosine; griseofulvin; echinocandins or 
aureobasidins, including naturally occurring and semi-synthetic analogs; 
dihydrobenzo[a]naphthacenequinones; nucleoside peptide antifungals including the 
polyoxins and nikkomycins; allylamines such as naftifine and other squalene epoxidase 
inhibitors; azoles, imidazoles and triazoles such as, e.g., clotrimazole, miconazole, 
ketoconazole, econazole, butoconazole, oxiconazole, terconazole, itraconazole or 
fluconazole and the like; rapamycin and rapalogs (non-immunosuppressive derivatives of 
rapamycin); cyclosporin A; FK506; terbinafine; and natural compounds found in goldenseal 
root powder, ipe roxo powder, poke root powder, lavender oil, and tea tree oil. In a specific 
embodiment, the invention provides a method of preventing, treating or ameliorating a 
fungal infection or one or more symptoms thereof, said method comprising administering to 
a subject in need thereof a dose of a prophylactically or therapeutically effective amount of 
a compound identified in accordance with the methods of the invention. In another 
embodiment, the invention provides a method of preventing, treating or ameliorating a 
fungal infection or one or more symptoms thereof, said method comprising administering to 
a subject in need thereof a dose of a prophylactically or therapeutically effective amount of 
a compound identified in accordance with the methods of the invention and a dose of a 
prophylactically or therapeutically effective amount of one or more other therapies (e.g., 
prophylactic or therapeutic agents). 

[0316] The invention provides methods for treating or ameliorating a fungal infection or 
a symptom thereof in a subject refractory to conventional therapies for such a fungal 
infection, said methods comprising administering to said subject a dose of a 
prophylactically or therapeutically effective amount of a compound identified in accordance 
with the methods of the invention. The invention also provides methods for treating or 
ameliorating a fungal infection or a symptom thereof in a subject refractory to existing 
single agent therapies for such a fungal infection, said methods comprising administering to 
said subject a dose of a prophylactically or therapeutically effective amount of a compound 
identified in accordance with the methods of the invention and a dose of a prophylactically 
or therapeutically effective amount of one or more other therapies (e.g., prophylactic or 
therapeutic agents). The invention also provides methods for treating a fungal infection by 
administering a compound identified in accordance with the methods of the invention in 
combination with any other therapy to patients who have proven refractory to other 
therapies but are no longer on these therapies. The invention also provides alternative 
methods for the treatment of a fungal infection where another therapy has proven or may 
prove too toxic, i.e., results in unacceptable or unbearable side effects, for the subject being 
treated. Further, the invention provides methods for preventing the recurrence of a fungal 
infection in patients that have been treated and have no disease activity by administering a 
compound identified in accordance with the methods of the invention. 

[0317] In this embodiment, target genes encoding proteins include, but are not limited 
to, complement decay-accelerating factor [precursor] synonym CD55 antigen; desmoglein 
1; human interferon gamma; interferon alpha-16 precursor (interferon alpha-wa); interferon 
alpha-1/13 precursor (interferon alpha-d) (leif d); interferon alpha-5 precursor (interferon 
alpha-g) (leif g) (interferon alpha-61); human leukocyte (alpha) interferon; interferon-beta 
1; interleukin 2; interleukin-12 beta chain precursor (il-12b) (cytotoxic lymphocyte 
maturation factor 40 kda subunit) (clmf p40) (nk cell stimulatory factor chain 2) (nksf2); 
interleukin-8 precursor (il-8) (monocyte-derived neutrophil chemotactic factor) (mdnsf) 
(T-cell chemotactic factor) (neutrophil-activating protein 1) (nap-1) (lymphocyte-derived 
neutrophil-activating factor) (lynap) (protein 3-10c) (neutrophil-activating factor) (naf) 
(granulocyte chemotactic protein 1) (gsp-1); and natural resistance-associated macrophage 
protein 1 (nramp 1). 

[0318] Any type of fungal infection can be prevented, treated or ameliorated in 
accordance with the methods of the invention. Examples of fungi which cause fungal 
infections include, but are not limited to, Absidia species (e.g., Absidia corymbifera and 
Absidia ramosa), Aspergillus species (e.g., Aspergillus flavus, Aspergillus fumigatus, 
Aspergillus nidulans, Aspergillus niger, and Aspergillus terreus), Basidiobolus ranarum, 
Blastomyces dermatitidis, Candida species (e.g., Candida albicans, Candida glabrata, 
Candida kerr, Candida krusei, Candida parapsilosis, Candida pseudotropicalis, Candida 
guilliermondii, Candida rugosa, Candida stellatoidea, and Candida tropicalis), 
Coccidioides immitis, Conidiobolus species, Cryptococcus neoformans, Cunninghamella 
species, dermatophytes, Histoplasma capsulatum, Microsporum gypseum, Mucor pusillus, 
Paracoccidioides brasiliensis, Pseudallescheria boydii, Rhinosporidium seeberi, 
Pneumocystis carinii, Rhizopus species (e.g., Rhizopus arrhizus, Rhizopus oryzae, and 
Rhizopus microsporus), Saccharomyces species, Sporothrix schenckii, zygomycetes, and 
classes such as Zygomycetes, Ascomycetes, the Basidiomycetes, Deuteromycetes, and 
Oomycetes. 

[0319] Fungal infection therapies and their dosages, routes of administration and 
recommended usage are known in the art and have been described in such literature as the 
Physician's Desk Reference (56th ed., 2002). 

5.9.7. Bacterial Infections 

[0320] A compound identified in accordance with the methods of the invention may be 
administered to a subject in need thereof to prevent, treat or ameliorate a bacterial infection 
or one or more conditions or symptoms thereof. A compound identified in accordance with 
the methods of the invention may also be administered in combination with one or more 
other therapies (e.g., prophylactic or therapeutic agents) to a subject in need thereof to 
prevent, treat or ameliorate a bacterial infection or one or more conditions or symptoms 
thereof. Preferably, such therapies are useful for the prevention or treatment of a bacterial 
infection. Examples of such therapies include, but are not limited to, amoxycillin, 
bacteriophages, chloramphenicol, chlorhexidine, co-trimoxazole, fluoroquinolones (e.g., 
ciprofloxacin and ofloxacin), isoniazid, macrolides, oxazolidinones, penicillin, quinolones, 
rifampicin, rifamycins, streptomycin, sulfonamides, and tetracyclines. In a specific 
embodiment, the invention provides a method of preventing, treating or ameliorating a 
bacterial infection or one or more symptoms thereof, said method comprising administering 
to a subject in need thereof a dose of a prophylactically or therapeutically effective amount 
of a compound identified in accordance with the methods of the invention. In another 
embodiment, the invention provides a method of preventing, treating or ameliorating a 
bacterial infection or one or more symptoms thereof, said method comprising administering 
to a subject in need thereof a dose of a prophylactically or therapeutically effective amount 
of a compound identified in accordance with the methods of the invention and a dose of a 
prophylactically or therapeutically effective amount of one or more other therapies (e.g., 
prophylactic or therapeutic agents). 

[0321] The invention provides methods for treating or ameliorating a bacterial infection 
or a symptom thereof in a subject refractory to conventional therapies for such a bacterial 
infection, said methods comprising administering to said subject a dose of a 
prophylactically or therapeutically effective amount of a compound identified in accordance 
with the methods of the invention. The invention also provides methods for treating or 
ameliorating a bacterial infection or a symptom thereof in a subject refractory to existing 
single agent therapies for such a bacterial infection, said methods comprising administering 
to said subject a dose of a prophylactically or therapeutically effective amount of a 
compound identified in accordance with the methods of the invention and a dose of a 
prophylactically or therapeutically effective amount of one or more other therapies (e.g., 
prophylactic or therapeutic agents). The invention also provides methods for treating a 
bacterial infection by administering a compound identified in accordance with the methods 
of the invention in combination with any other therapy to patients who have proven 
refractory to other therapies but are no longer on these therapies. The invention also 
provides alternative methods for the treatment of a bacterial infection where another therapy 
has proven or may prove too toxic, i.e., results in unacceptable or unbearable side effects, 
for the subject being treated. Further, the invention provides methods for preventing the 
recurrence of a bacterial infection in patients that have been treated and have no disease 
activity by administering a compound identified in accordance with the methods of the 
invention. 

[0322] In this embodiment, target genes encoding proteins include, but are not limited 
to, ADP-ribosylation factor-4; bactericidal/permeability-increasing protein; complement 
decay-accelerating factor [precursor] synonym CD55 antigen; desmoglein 1; human 
interferon gamma; interferon alpha-16 precursor (interferon alpha-wa); interferon 
alpha-1/13 precursor (interferon alpha-d) (leif d); interferon alpha-5 precursor (interferon 
alpha-g) (leif g) (interferon alpha-61); human leukocyte (alpha) interferon; interferon-beta 1; 
interleukin 2; interleukin-12 beta chain precursor (il-12b) (cytotoxic lymphocyte maturation 
factor 40 kda subunit) (clmf p40) (nk cell stimulatory factor chain 2) (nksf2); interleukin-8 
precursor (il-8) (monocyte-derived neutrophil chemotactic factor) (mdnsf) (T-cell 
chemotactic factor) (neutrophil-activating protein 1) (nap-1) (lymphocyte-derived 
neutrophil-activating factor) (lynap) (protein 3-10c) (neutrophil-activating factor) (naf) 
(granulocyte chemotactic protein 1) (gsp-1); and natural resistance-associated macrophage 
protein 1 (nramp 1). 

[0323] Any type of bacterial infection can be prevented, treated or ameliorated in 
accordance with the methods of the invention. Examples of bacteria which cause bacterial 
infections include, but are not limited to, the Aquaspirillum family, Azospirillum family, 
Azotobacteraceae family, Bacteroidaceae family, Bartonella species, Bdellovibrio family, 
Campylobacter species, Chlamydia species (e.g., Chlamydia pneumoniae), Clostridium, 
Enterobacteriaceae family (e.g., Citrobacter species, Edwardsiella, Enterobacter 
aerogenes, Erwinia species, Escherichia coli, Hafnia species, Klebsiella species, 
Morganella species, Proteus vulgaris, Providencia, Salmonella species, Serratia 
marcescens, and Shigella flexneri), Gardnerella family, Haemophilus influenzae, 
Halobacteriaceae family, Helicobacter family, Legionellaceae family, Listeria species, 
Methylococcaceae family, mycobacteria (e.g., Mycobacterium tuberculosis), Neisseriaceae 
family, Oceanospirillum family, Pasteurellaceae family, Pneumococcus species, 
Pseudomonas species, Rhizobiaceae family, Spirillum family, Spirosomaceae family, 
Staphylococcus (e.g., methicillin resistant Staphylococcus aureus and Staphylococcus 
pyrogenes), Streptococcus (e.g., Streptococcus enteritidis, Streptococcus fasciae, and 
Streptococcus pneumoniae), and Vampirovibrio family. 

[0324] Bacterial infection therapies and their dosages, routes of administration and 
recommended usage are known in the art and have been described in such literature as the 
Physician's Desk Reference (56th ed., 2002). 

5.9.8. Cardiovascular Diseases 

[0325] A compound identified in accordance with the methods of the invention may be 
administered to a subject in need thereof to prevent, treat or ameliorate a cardiovascular 
disease or one or more symptoms thereof. A compound identified in accordance with the 
methods of the invention may also be administered in combination with one or more other 
therapies (e.g., prophylactic or therapeutic agents) to a subject in need thereof to prevent, 
treat or ameliorate a cardiovascular disease or one or more symptoms thereof. Preferably, 
such therapies are useful for the prevention or treatment of a cardiovascular disease. 
Examples of such therapies include, but are not limited to, peripheral antiadrenergic drugs, 
centrally acting antihypertensive drugs (e.g., methyldopa and methyldopa HCl), 
antihypertensive direct vasodilators (e.g., diazoxide and hydralazine HCl), drugs affecting 
the renin-angiotensin system, peripheral vasodilators, phentolamine, antianginal drugs, cardiac 
glycosides, inodilators (e.g., amrinone, milrinone, enoximone, fenoximone, imazodan, and 
sulmazole), antidysrhythmic drugs, calcium entry blockers, ranitine, bosentan, and rezulin. 
In a specific embodiment, the invention provides a method of preventing, treating or 
ameliorating a cardiovascular disease or one or more symptoms thereof, said method 
comprising administering to a subject in need thereof a dose of a prophylactically or 
therapeutically effective amount of a compound identified in accordance with the methods 
of the invention. In another embodiment, the invention provides a method of preventing, 
treating or ameliorating a cardiovascular disease or one or more symptoms thereof, said 
method comprising administering to a subject in need thereof a dose of a prophylactically or 
therapeutically effective amount of a compound identified in accordance with the methods 
of the invention and a dose of a prophylactically or therapeutically effective amount of one 
or more other therapies (e.g., prophylactic or therapeutic agents). 

[0326] The invention provides methods for treating or ameliorating one or more 
symptoms of a cardiovascular disease in a subject refractory to conventional therapies for 
such a cardiovascular disease, said methods comprising administering to said subject a dose 
of a prophylactically or therapeutically effective amount of a compound identified in 
accordance with the methods of the invention. The invention also provides methods for 
treating or ameliorating one or more symptoms of a cardiovascular disease in a subject 
refractory to existing single agent therapies for such a cardiovascular disease, said methods 
comprising administering to said subject a dose of a prophylactically or therapeutically 
effective amount of a compound identified in accordance with the methods of the invention 
and a dose of a prophylactically or therapeutically effective amount of one or more other 
therapies (e.g., prophylactic or therapeutic agents). The invention also provides methods 
for treating a cardiovascular disease by administering a compound identified in accordance 
with the methods of the invention in combination with any other therapy to patients who 
have proven refractory to other therapies but are no longer on these therapies. The 
invention also provides alternative methods for the treatment of a cardiovascular disease 
where another therapy has proven or may prove too toxic, i.e., results in unacceptable or 
unbearable side effects, for the subject being treated. Further, the invention provides 
methods for preventing the recurrence of a cardiovascular disease in patients that have been 
treated and have no disease activity by administering a compound identified in accordance 
with the methods of the invention. 

[0327] In this embodiment, target genes encoding proteins include, but are not limited 
to, 3-hydroxy-3-methylglutaryl-CoA reductase; actin, alpha cardiac; acyl-CoA 
dehydrogenase; angiotensin I-converting enzyme; bile salt export pump (atp-binding 
cassette, sub-family b, member 11); cardiac muscle troponin T; carnitine 
O-palmitoyltransferase; emotakin ATP-binding cassette, sub-family a, member 1 (atp-binding 
cassette transporter 1) (atp-binding cassette 1) (abc-1) (cholesterol efflux regulatory 
protein); erythropoietin; fibrillin; human triosephosphate isomerase; iduronate 2-sulfatase; 
klotho; and thrombomodulin. 

[0328] Any cardiovascular disease can be prevented, treated or ameliorated in 
accordance with the methods of the invention. Examples of cardiovascular diseases 
include, but are not limited to, atherosclerosis, stroke, cerebral infarction, endothelium 
dysfunctions (in particular, those dysfunctions affecting blood vessel elasticity), ischemic 
heart disease (e.g., angina pectoris, myocardial infarction, and chronic ischemic heart 
disease), hypertensive heart disease, pulmonary heart disease, coronary heart disease, 
valvular heart disease (e.g., rheumatic fever and rheumatic heart disease, endocarditis, 
mitral valve prolapse, restenosis and aortic valve stenosis), congenital heart disease (e.g., 
valvular and vascular obstructive lesions, atrial or ventricular septal defect, and patent 
ductus arteriosus), and myocardial disease (e.g., myocarditis, congestive cardiomyopathy, 
and hypertrophic cardiomyopathy). 

[0329] Cardiovascular disease therapies and their dosages, routes of administration and 
recommended usage are known in the art and have been described in such literature as the 
Physician's Desk Reference (56th ed., 2002). 



5.9.9. Central Nervous System Disorders 

[0330] A compound identified in accordance with the methods of the invention may be 
administered to a subject in need thereof to prevent, treat or ameliorate a central nervous 
system ("CNS") disorder or one or more symptoms thereof. A compound identified in 
accordance with the methods of the invention may also be administered in combination 
with one or more other therapies (e.g., prophylactic or therapeutic agents) to a subject in 
need thereof to prevent, treat or ameliorate a CNS disorder or one or more symptoms 
thereof. Preferably, such therapies are useful for the prevention or treatment of a CNS 
disorder. Examples of such therapies include, but are not limited to, levodopa, Parlodel 
(bromocriptine), Permax (pergolide), Eldepryl (selegiline hydrochloride), donepezil 
(Aricept®), tacrine (Cognex®), acyclovir, antibiotics, chemotherapeutics, and radiation 
therapy. In a specific embodiment, the invention provides a method of preventing, treating 
or ameliorating a CNS disorder or one or more symptoms thereof, said method comprising 
administering to a subject in need thereof a dose of a prophylactically or therapeutically 
effective amount of a compound identified in accordance with the methods of the invention. 
In another embodiment, the invention provides a method of preventing, treating or 
ameliorating a CNS disorder or one or more symptoms thereof, said method comprising 
administering to a subject in need thereof a dose of a prophylactically or therapeutically 
effective amount of a compound identified in accordance with the methods of the invention 
and a dose of a prophylactically or therapeutically effective amount of one or more other 
therapies (e.g., prophylactic or therapeutic agents). 

[0331] The invention provides methods for treating or ameliorating one or more 
symptoms of a CNS disorder in a subject refractory to conventional therapies for such a 
CNS disorder, said methods comprising administering to said subject a dose of a 
prophylactically or therapeutically effective amount of a compound identified in accordance 
with the methods of the invention. The invention also provides methods for treating or 
ameliorating one or more symptoms of a CNS disorder in a subject refractory to existing 
single agent therapies for such a CNS disorder, said methods comprising administering to 
said subject a dose of a prophylactically or therapeutically effective amount of a compound 
identified in accordance with the methods of the invention and a dose of a prophylactically 
or therapeutically effective amount of one or more other therapies (e.g., prophylactic or 
therapeutic agents). The invention also provides methods for treating a CNS disorder by 
administering a compound identified in accordance with the methods of the invention in 
combination with any other therapy to patients who have proven refractory to other 
therapies but are no longer on these therapies. The invention also provides alternative 
methods for the treatment of a CNS disorder where another therapy has proven or may 
prove too toxic, i.e., results in unacceptable or unbearable side effects, for the subject being 
treated. Further, the invention provides methods for preventing the recurrence of a 
CNS disorder in patients that have been treated and have no disease activity by 
administering a compound identified in accordance with the methods of the invention. 

[0332] In this embodiment, target genes encoding proteins include, but are not limited 
to, acetylcholinesterase; Alzheimer's disease amyloid A4 [precursor], synonyms: protease 
nexin-II PN-II APPI; beta-site APP-cleaving enzyme 2; catechol-O-methyltransferase; 
DREAM/calsenilin/KChIP3; D-amino-acid oxidase; drebrin-1 dendritic spine protein; 
glutamic acid decarboxylase 2; glutamic acid decarboxylase, brain, membrane form; 
glutamic acid decarboxylase 3; human D-1 dopamine receptor; huntingtin; kallikrein 6; 
monoamine oxidase-A; monoamine oxidase-B; N-methyl-D-aspartate (NMDA) receptor; 
peroxisome assembly factor-2 (paf-2) (peroxisomal-type atpase 1) (peroxin-6); and 
vanilloid receptor subunit 1 (capsaicin receptor). 

[0333] Any CNS disorder can be treated in accordance with the methods of the 
invention. Examples of CNS disorders include, but are not limited to, bacterial and viral 
meningitis, Alzheimer's disease, cerebral toxoplasmosis, Parkinson's disease, multiple 
sclerosis, brain cancers (e.g., metastatic carcinoma of the brain, glioblastoma, astrocytoma, 
and acoustic neuroma), hydrocephalus, and encephalitis. 

[0334] CNS disorder therapies and their dosages, routes of administration and 
recommended usage are known in the art and have been described in such literature as the 
Physician's Desk Reference (56th ed., 2002). 

5.10. Classification of UTRs, Compound, and Disease State 

[0335] The names of the compounds identified in accordance with the methods 
described herein and the names of the genes whose expression is modulated in response to 
said compounds can be maintained in a database. By performing an assay for modulators of 
untranslated region-dependent gene expression using the same assay format for a group of 
untranslated regions of target genes (i.e., an assay format in which the only variable 
between each individual assay is the nucleotide sequence of untranslated regions operably 
linked to a reporter gene) and by storing all relevant data in a database, cluster analysis can 
be performed on the data and functional associations between and/or within relevant data 
sets (e.g., (a) compounds, (b) nucleotide sequences of untranslated regions of genes, and (c) 
pathological conditions associated with genes) can be determined. 

[0336] For example, if a common set of compounds modulates the expression of a 
reporter gene when the latter is operably linked to a set of untranslated regions from all 
untranslated regions analyzed, it can be concluded that the untranslated regions in the set 
are involved in a common process or processes in post-transcriptional control of gene 
expression. If functional involvement in post-transcriptional control of gene expression has 
been reported for any of the untranslated regions in the set, the untranslated regions without 
known roles in gene expression regulation can be assigned a function. By performing 
further analysis and looking for sets of compounds that modulate sets of untranslated 
regions common to a particular pathological condition, the following can be identified: (a) 
members of biochemical reaction pathways involved in the disease, (b) targets for multiple 
drug intervention and/or regulation, and (c) multiple pathological conditions that can be 
treated with a single compound or set of compounds. 
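
The grouping logic described above can be sketched in a few lines of Python. This is an illustration only, not the analysis method of the invention: the text does not name a clustering algorithm, so the sketch simply groups untranslated regions that share an identical set of modulating compounds. The function name, compound labels and UTR labels are all hypothetical.

```python
from collections import defaultdict


def group_utrs_by_compound_profile(hits):
    """Group UTRs that are modulated by exactly the same set of compounds.

    hits: iterable of (compound, utr) pairs, one pair for each assay in which
    the compound modulated expression of the reporter linked to that UTR.
    Returns a dict mapping frozenset(compounds) -> sorted list of UTR names.
    """
    profile = defaultdict(set)
    for compound, utr in hits:
        profile[utr].add(compound)

    groups = defaultdict(list)
    for utr, compounds in profile.items():
        groups[frozenset(compounds)].append(utr)
    return {compounds: sorted(utrs) for compounds, utrs in groups.items()}


# Hypothetical screening results (labels are placeholders, not real data).
hits = [
    ("cmpd_A", "UTR_1"), ("cmpd_B", "UTR_1"),
    ("cmpd_A", "UTR_2"), ("cmpd_B", "UTR_2"),
    ("cmpd_C", "UTR_3"),
]
groups = group_utrs_by_compound_profile(hits)
for compounds, utrs in groups.items():
    print(sorted(compounds), "->", utrs)
```

Exact-profile grouping is the simplest case of the reasoning in the text. With real assay data, a similarity-based method (for example, hierarchical clustering on a compound-by-UTR response matrix) would tolerate partial overlap between compound sets.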

6. EXAMPLE: THERAPEUTIC UNTRANSLATED REGION TARGETS 
[0337] The therapeutic targets presented herein are by way of example, and the present 
invention is not to be limited by the targets described herein. One of skill in the art will 
understand that the therapeutic targets presented herein as DNA sequences can be 
converted to RNA sequences. 
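
As a minimal sketch of that conversion, the following Python function rewrites a DNA sequence printed in the block layout used below (lowercase, space-separated groups) as the corresponding RNA sequence by replacing thymine with uracil. The function name and the decision to reject characters other than a, c, g, t and n are assumptions made for illustration.

```python
def dna_to_rna(dna: str) -> str:
    """Convert a sense-strand DNA sequence to the equivalent RNA sequence.

    Whitespace (the spaces and line breaks used to print sequences in
    blocks) is removed, and every T is replaced with U.
    """
    cleaned = "".join(dna.split()).lower()
    unexpected = set(cleaned) - set("acgtn")
    if unexpected:
        raise ValueError(f"unexpected characters in sequence: {sorted(unexpected)}")
    return cleaned.upper().replace("T", "U")


# The DNA form of a five-fold AUUUA repeat, written in printed block layout.
print(dna_to_rna("atttatttat ttatttattt a"))
```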

6.1. Tumor Necrosis Factor Alpha 
[0338] See, e.g., GenBank Accession # X01394. 

General Target Regions: 

[0339] (1) 5' Untranslated Region - nts 1 - 152 of GenBank Accession # X01394: 
gcagaggacc agctaagagg gagagaagca actacagacc ccccctgaaa acaaccctca gacgccacat 
cccctgacaa gctgccaggc aggttctctt cctctcacat actgacccac ggctccaccc tctctcccct ggaaaggaca cc 

(SEQ ID NO: 5) 

[0340] (2) 3' Untranslated Region - nts 852 - 1643 of GenBank Accession # X01394: 

tgaggagga cgaacatcca accttcccaa acgcctcccc tgccccaatc cctttattac cccctccttc agacaccctc 
aacctcttct ggctcaaaaa gagaattggg ggcttagggt cggaacccaa gcttagaact ttaagcaaca agaccaccac 
ttcgaaacct gggattcagg aatgtgtggc ctgcacagtg aattgctggc aaccactaag aattcaaact ggggcctcca 
gaactcactg gggcctacag ctttgatccc tgacatctgg aatctggaga ccagggagcc tttggttctg gccagaatgc 
tgcaggactt gagaagacct cacctagaaa ttgacacaag tggaccttag gccttcctct ctccagatgt ttccagactt 
ccttgagaca cggagcccag ccctccccat ggagccagct ccctctattt atgtttgcac ttgtgattat ttattattta tttattattt 
atttatttac agatgaatgt atttatttgg gagaccgggg tatcctgggg gacccaatgt aggagctgcc ttggctcaga 
catgttttcc gtgaaaacgg agctgaacaa taggctgttc ccatgtagcc ccctggcctc tgtgccttct tttgattatg 
ttttttaaaa tatttatctg attaagttgt ctaaacaatg ctgatttggt gaccaactgt cactcattgc tgagcctctg 
ctccccaggg gagttgtgtc tgtaatcgcc ctactattca gtggcgagaa ataaagtttg ctt (SEQ ID NO: 6) 

Initial Specific Target Motif: 

[0341] (3) Group I AU-Rich Element (ARE) Cluster in 3' untranslated region: 
5' AUUUAUUUAUUUAUUUAUUUA 3' (SEQ ID NO: 7) 



6.2. Granulocyte-macrophage Colony Stimulating Factor 
[0342] See, e.g., GenBank Accession # NM_000758 or # XM_003751. 
General Target Regions: 

[0343] (1) 5' Untranslated Region - nts 1 - 32 of GenBank Accession # NM_000758: 

g/tctggaggat gtggctgcag agcctgctgc tcttgggcac (SEQ ID NO: 8) 

[0344] (2) 3' Untranslated Region - nts 468 - 789 of GenBank Accession # NM_000758: 

gcc ggggagctgc tctctcatga aacaagagct agaaactcag gatggtcatc ttggagggac caaggggtgg 
gccacagcca tggtgggagt ggcctggacc tgccctgggc cacactgacc ctgatacagg catggcagaa 
gaatgggaat attttatact gacagaaatc agtaatattt atatatttat atttttaaaa tatttattta tttatttatt taagttcata 
ttccatattt attcaagatg ttttaccgta ataattatta ttaaaaatat gcttct (SEQ ID NO: 9) 
[0345] Initial Specific Target Motif: 

[0346] Group I AU-Rich Element (ARE) Cluster in 3' untranslated region: 
5' AUUUAUUUAUUUAUUUAUUUA 3' (SEQ ID NO: 10) 

6.3. Interleukin 2 

[0347] See, e.g., GenBank Accession # U25676. 

General Target Regions: 

[0348] (1) 5' Untranslated Region - nts 1 - 47 of GenBank Accession # U25676: 
atcactctct ttaatcacta ctcacattaa cctcaactcc tgccaca (SEQ ID NO: 11) 
[0349] (2) 3' Untranslated Region - nts 519 - 825 of GenBank Accession # U25676: 
ta attaagtgct tcccacttaa aacatatcag gccttctatt tatttattta aatatttaaa ttttatattt attgttgaat gtatggttgc 
tacctattgt aactattatt cttaatctta aaactataaa tatggatctt ttatgattct ttttgtaagc cctaggggct ctaaaatggt 
ttaccttatt tatcccaaaa atatttatta ttatgttgaa tgttaaatat agtatctatg tagattggtt agtaaaacta tttaataaat 
ttgataaata taaaaaaaaa aaacaaaaaa aaaaa (SEQ ID NO: 12) 
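
The "nts X - Y" ranges used for the general target regions can be applied to a sequence programmatically. The sketch below assumes the ranges are 1-based and inclusive, as nucleotide positions in GenBank records are, and uses a hypothetical helper name. For brevity the input is only the 5' untranslated region excerpt printed above; in practice the full sequence of the accession would be supplied.

```python
def extract_region(sequence: str, start: int, end: int) -> str:
    """Return nucleotides start..end (1-based, inclusive) of a sequence.

    Whitespace from the printed block layout is ignored when counting.
    """
    seq = "".join(sequence.split()).lower()
    if not (1 <= start <= end <= len(seq)):
        raise ValueError(f"range {start}-{end} is outside 1-{len(seq)}")
    return seq[start - 1:end]


# Interleukin 2 5' untranslated region as printed above (nts 1 - 47).
il2_5utr = "atcactctct ttaatcacta ctcacattaa cctcaactcc tgccaca"

region = extract_region(il2_5utr, 1, 47)
print(len(region))  # a range X - Y spans Y - X + 1 nucleotides
```

Checking that the extracted length equals Y - X + 1 is a quick way to catch transcription errors in either the coordinates or the printed sequence.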
[0350] Initial Specific Target Motifs: 

Group III AU-Rich Element (ARE) Cluster in 3' untranslated region: 
5' NAUUUAUUUAUUUAN 3' (SEQ ID NO: 13) 
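
A motif of this kind can be located in an untranslated region with a short script. The sketch below is one possible approach, not a method described in the source: it assumes that N denotes any nucleotide (the usual IUPAC convention), converts the printed DNA to RNA, and reports 1-based start positions. The helper names are hypothetical, and the test fragment is a 30-nucleotide excerpt of the interleukin 2 3' untranslated region printed above.

```python
import re

# Regular-expression equivalents for the symbols used in the motifs.
# Only A, C, G, U and N are handled; N is assumed to mean "any nucleotide".
SYMBOLS = {"A": "A", "C": "C", "G": "G", "U": "U", "N": "[ACGU]"}


def dna_to_rna(dna: str) -> str:
    """Strip whitespace, upper-case, and replace T with U."""
    return "".join(dna.split()).upper().replace("T", "U")


def find_motif(rna: str, motif: str) -> list:
    """Return 1-based start positions of every (possibly overlapping) match."""
    pattern = "".join(SYMBOLS[base] for base in motif.upper())
    # A lookahead lets overlapping occurrences be reported individually.
    return [m.start() + 1 for m in re.finditer(f"(?=({pattern}))", rna)]


GROUP_III_ARE = "NAUUUAUUUAUUUAN"

# Excerpt of the interleukin 2 3' untranslated region shown above.
fragment = "gccttctatt tatttattta aatatttaaa"
print(find_motif(dna_to_rna(fragment), GROUP_III_ARE))  # → [7]
```

Positions are relative to the fragment supplied, so they must be offset by the start of the region to obtain coordinates in the full accession.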

6.4. Interleukin 6 

[0351] See, e.g., GenBank Accession # NM_000600. 
General Target Regions: 

[0352] (1) 5' Untranslated Region - nts 1 - 62 of GenBank Accession # NM_000600: 

ttctgccctc gagcccaccg ggaacgaaag agaagctcta tctcgcctcc aggagcccag ct (SEQ ID NO: 14) 
[0353] (2) 3' Untranslated Region - nts 699 - 1125 of GenBank Accession # NM_000600: 

ta gcatgggcac ctcagattgt tgttgttaat gggcattcct tcttctggtc agaaacctgt ccactgggca cagaacttat 
gttgttctct atggagaact aaaagtatga gcgttaggac actattttaa ttatttttaa tttattaata tttaaatatg tgaagctgag 
ttaatttatg taagtcatat ttatattttt aagaagtacc acttgaaaca ttttatgtat tagttttgaa ataataatgg aaagtggcta 
tgcagtttga atatcctttg tttcagagcc agatcatttc ttggaaagtg taggcttacc tcaaataaat ggctaactta 
tacatatttt taaagaaata tttatattgt atttatataa tgtataaatg gtttttatac caataaatgg cattttaaaa aattc (SEQ 
ID NO: 15) 

[0354] Initial Specific Target Motifs: 

Group III AU-Rich Element (ARE) Cluster in 3' untranslated region: 
5' NAUUUAUUUAUUUAN 3' (SEQ ID NO: 16) 

6.5. Vascular Endothelial Growth Factor 
[0355] See, e.g., GenBank Accession # AF022375. 
General Target Regions: 

[0356] (1) 5' Untranslated Region - nts 1 - 701 of GenBank Accession # AF022375: 
aagagctcca gagagaagtc gaggaagaga gagacggggt cagagagagc gcgcgggcgt gcgagcagcg 
aaagcgacag gggcaaagtg agtgacctgc ttttgggggt gaccgccgga gcgcggcgtg agccctcccc 
cttgggatcc cgcagctgac cagtcgcgct gacggacaga cagacagaca ccgcccccag ccccagttac cacctcctcc 
ccggccggcg gcggacagtg gacgcggcgg cgagccgcgg gcaggggccg gagcccgccc ccggaggcgg 
ggtggagggg gtcggagctc gcggcgtcgc actgaaactt ttcgtccaac ttctgggctg ttctcgcttc ggaggagccg 
tggtccgcgc gggggaagcc gagccgagcg gagccgcgag aagtgctagc tcgggccggg aggagccgca 
gccggaggag ggggaggagg aagaagagaa ggaagaggag agggggccgc agtggcgact cggcgctcgg 
aagccgggct catggacggg tgaggcggcg gtgtgcgcag acagtgctcc agcgcgcgcg ctccccagcc 
ctggcccggc ctcgggccgg gaggaagagt agctcgccga ggcgccgagg agagcgggcc gccccacagc 
ccgagccgga gagggacgcg agccgcgcgc cccggtcggg cctccgaaac c (SEQ ID NO: 17) 
[0357] (2) 3' Untranslated Region - nts 1275 - 3166 of GenBank Accession # AF022375:

tgagcc gggcaggagg aaggagcctc cctcagggtt tcgggaacca gatctctctc caggaaagac tgatacagaa 

cgatcgatac agaaaccacg ctgccgccac cacaccatca ccatcgacag aacagtcctt aatccagaaa cctgaaatga 

aggaagagga gactctgcgc agagcacttt gggtccggag ggcgagactc cggcggaagc attcccgggc 

gggtgaccca gcacggtccc tcttggaatt ggattcgcca ttttattttt cttgctgcta aatcaccgag cccggaagat 

tagagagttt tatttctggg attcctgtag acacacccac ccacatacat acatttatat atatatatat tatatatata 
taaaaataaa tatctctatt ttatatatat aaaatatata tattcttttt ttaaattaac agtgctaatg ttattggtgt cttcactgga
tgtatttgac tgctgtggac ttgagttggg aggggaatgt tcccactcag atcctgacag ggaagaggag gagatgagag 
actctggcat gatctttttt ttgtcccact tggtggggcc agggtcctct cccctgccca agaatgtgca aggccagggc 
atgggggcaa atatgaccca gttttgggaa caccgacaaa cccagccctg gcgctgagcc tctctacccc aggtcagacg 
gacagaaaga caaatcacag gttccgggat gaggacaccg gctctgacca ggagtttggg gagcttcagg acattgctgt 
gctttgggga ttccctccac atgctgcacg cgcatctcgc ccccaggggc actgcctgga agattcagga gcctgggcgg 
ccttcgctta ctctcacctg cttctgagtt gcccaggagg ccactggcag atgtcccggc gaagagaaga gacacattgt 
tggaagaagc agcccatgac agcgcccctt cctgggactc gccctcatcc tcttcctgct ccccttcctg gggtgcagcc 
taaaaggacc tatgtcctca caccattgaa accactagtt ctgtcccccc aggaaacctg gttgtgtgtg tgtgagtggt 
tgaccttcct ccatcccctg gtccttccct tcccttcccg aggcacagag agacagggca ggatccacgt gcccattgtg 
gaggcagaga aaagagaaag tgttttatat acggtactta tttaatatcc ctttttaatt agaaattaga acagttaatt 
taattaaaga gtagggtttt ttttcagtat tcttggttaa tatttaattt caactattta tgagatgtat cttttgctct ctcttgctct 
cttatttgta ccggtttttg tatataaaat tcatgtttcc aatctctctc tccctgatcg gtgacagtca ctagcttatc ttgaacagat 
atttaatttt gctaacactc agctctgccc tccccgatcc cctggctccc cagcacacat tcctttgaaa gagggtttca 
atatacatct acatactata tatatattgg gcaacttgta tttgtgtgta tatatatata tatatgttta tgtatatatg tgatcctgaa 
aaaataaaca tcgctattct gttttttata tgttcaaacc aaacaagaaa aaatagagaa ttctacatac taaatctctc 
tcctttttta attttaatat ttgttatcat ttatttattg gtgctactgt ttatccgtaa taattgtggg gaaaagatat taacatcacg 
tctttgtctc tagtgcagtt tttcgagata ttccgtagta catatttatt tttaaacaac gacaaagaaa tacagatata 
tcttaaaaaa aaaaaa (SEQ ID NO: 18)
[0358] Initial Specific Target Motifs: 

[0359] (1) Internal Ribosome Entry Site (IRES) in 5' untranslated region nts 513 - 704:

5'CCGGGCUCAUGGACGGGUGAGGCGGCGGUGUGCGCAGACAGUGCUCCAGCG 

CGCGCGCUCCCCAGCCCUGGCCCGGCCUCGGGCCGGGAGGAAGAGUAGCUCG 

CCGAGGCGCCGAGGAGAGCGGGCCGCCCCACAGCCCGAGCCGGAGAGGGACG 

CGAGCCGCGCGCCCCGGUCGGGCCUCCGAAACCAUGAACUUUCUGCUGUCUU 

GGGUGCAUUGGAGCCUUGCCUUGCUGCUCUACCUCCACCAUG 3' (SEQ ID NO: 19)

[0360] (2) Group III AU-Rich Element (ARE) Cluster in 3' untranslated region:
5' NAUUUAUUUAUUUAN 3' (SEQ ID NO: 20) 

6.6. Survivin 
[0361] See, e.g., GenBank Accession # NM_001168. 
General Target Regions: 
[0362] (1) 5' Untranslated Region - nts 1 - 49 of GenBank Accession # NM_001168:

ccgccagatt tgaatcgcgg gacccgttgg cagaggtggc ggcggcggc (SEQ ID NO: 21) 

[0363] (2) 3' Untranslated Region - nts 479 - 1619 of GenBank Accession # NM_001168:

gg cctctggccg gagctgcctg gtcccagagt ggctgcacca cttccagggt ttattccctg gtgccaccag ccttcctgtg 
ggccccttag caatgtctta ggaaaggaga tcaacatttt caaattagat gtttcaactg tgctcctgtt ttgtcttgaa 
agtggcacca gaggtgcttc tgcctgtgca gcgggtgctg ctggtaacag tggctgcttc tctctctctc tctctttttt 
gggggctcat ttttgctgtt ttgattcccg ggcttaccag gtgagaagtg agggaggaag aaggcagtgt cccttttgct 
agagctgaca gctttgttcg cgtgggcaga gccttccaca gtgaatgtgt ctggacctca tgttgttgag gctgtcacag 
tcctgagtgt ggacttggca ggtgcctgtt gaatctgagc tgcaggttcc ttatctgtca cacctgtgcc tcctcagagg 
acagtttttt tgttgttgtg tttttttgtt tttttttttt ggtagatgca tgacttgtgt gtgatgagag aatggagaca gagtccctgg 
ctcctctact gtttaacaac atggctttct tattttgttt gaattgttaa ttcacagaat agcacaaact acaattaaaa 
ctaagcacaa agccattcta agtcattggg gaaacggggt gaacttcagg tggatgagga gacagaatag agtgatagga 
agcgtctggc agatactcct tttgccactg ctgtgtgatt agacaggccc agtgagccgc ggggcacatg ctggccgctc 
ctccctcaga aaaaggcagt ggcctaaatc ctttttaaat gacttggctc gatgctgtgg gggactggct gggctgctgc 
aggccgtgtg tctgtcagcc caaccttcac atctgtcacg ttctccacac gggggagaga cgcagtccgc ccaggtcccc 
gctttctttg gaggcagcag ctcccgcagg gctgaagtct ggcgtaagat gatggatttg attcgccctc ctccctgtca 
tagagctgca gggtggattg ttacagcttc gctggaaacc tctggaggtc atctcggctg ttcctgagaa ataaaaagcc 
tgtcatttc (SEQ ID NO: 22) 

6.7. Epidermal Growth Factor Receptor 
[0364] See, e.g., GenBank Accession # X00588.1. 
General Target Regions: 

[0365] (1) 5' Untranslated Region (247 nt) (GenBank Accession No. Hu EST
gi|6302071|gb|AW163038.1|AW163038): 

ccccggcgcagcgcggccgcagcagcctccgccccccgcacggtgtgagcgcccgacgcggccgaggcggccggagtccc 
gagctagccccggcggccgccgccgcccagaccggacgacaggccacctcgtcggcgtccgcccgagtccccgcctcgccgc 
caacgccacaaccaccgcgcacggccccctgactccgtccagtattgatcgggagagccggagcgagctcttcggggagcagc 
ag (SEQ ID NO: 23) 

[0366] (2) 3' Untranslated Region (1.7 kb, 58% AT-density): 

tgaccacggaggatagtatgagccctaaaaatccagactctttcgatacccaggaccaagccacagcaggtcctccatcccaacag 

ccatgcccgcattagctcttagacccacagactggttttgcaacgtttacaccgactagccaggaagtacttccacctcgggcacattt 

tgggaagttgcattcctttgtcttcaaactgtgaagcatttacagaaacgcatccagcaagaatattgtccctttgagcagaaatttatctt 

tcaaagaggtatatttgaaaaaaaaaaaaaaagtatatgtgaggatttttattgattggggatcttggagtttttcattgtcgctattgatttt 

tacttcaatgggctcttccaacaaggaagaagcttgctggtagcacttgctaccctgagttcatccaggcccaactgtgagcaagga 
gcacaagccacaagtcttccagaggatgcttgattccagtggttctgcttcaaggcttccactgcaaaacactaaagatccaagaag

gccttcatggccccagcaggccggatcggtactgtatcaagtcatggcaggtacagtaggataagccactctgtcccttcctgggca 

aagaagaaacggaggggatgaattcttccttagacttacttttgtaaaaatgtccccacggtacttactccccactgatggaccagtgg 

tttccagtcatgagcgttagactgacttgtttgtcttccattccattgllitgaaactcagtatgccgcccctgtcttgctgtcatg 

caagagaggatgacacatcaaataataactcggattccagcccacattggattcatcagcatttggaccaatagcccacagctgaga 

atgtggaatacctaaggataacaccgcttttgttctcgcaaaaacgtatctcctaatttgaggctcagatgaaatgcatcaggtcctttg 

gggcatagatcagaagactacaaaaatgaagctgctctgaaatctcctttagccatcaccccaaccccccaaaattagtttgtgttactt 

atggaagatagttttctccttttacttcacttcaaaagctttttactcaaagagtatatgttccctccaggtcagctgcccccaaaccccct 

ccttacgctttgtcacacaaaaagtgtctctgccttgagtcatctattcaagcacttacagctctggccacaacagggcattttacaggt 

gcgaatgacagtagcattatgagtagtgtgaattcaggtagtaaatatgaaactagggtttgaaattgataatgctttcacaacatttgca 

gatgttttagaaggaaaaaagttccttcctaaaataatttctctacaattggaagattggaagattcagctagttaggagcccattttttcct 

aatctgtgtgtgccctgtaacctgactggttaacagcagtcctttgtaaacagtgttttaaactctcctagtcaatatccaccccatccaat 

ttatcaaggaagaaatggttcagaaaatattttcagcctacagttatgttcagtcacacacacatacaaaatgttccttttgcttttaaagta 

atttttgactcccagatcagtcagagcccctacagcattgttaagaaagtatttgatttttgtctcaatgaaaataaaactatattcatttcc 

(SEQ ID NO: 24)

6.8. CCAAT/Enhancer Binding Protein 
[0367] See, e.g., GenBank Accession # NM_004364. 
General Target Regions: 

[0368] (1) CEBP-α, uORF (5' UTR, 160 nt):

Tataaaagctgggccggcgcgggccgggccattcgcgacccggaggtgcgcgggcgcgggcgagcagggtctccgggtgg 
gcggcgcgacgccccgcgcaggctggaggccgccgaggctcgccatgccgggagaactctaactcccccatggagtcggc 
(SEQ ID NO: 25)

[0369] (2) CEBP-α, uORF (3' UTR, 1306 nt):

tgaggcgcgcggctgtgggaccgccctgggccagcctccggcggggacccagggagtggtttggggtcgccggatctcgagg 

cttgcccagaccgtgcgagccaggactaggagattccggtgcctcctgaaagcctggcctgctccgcgtgtcccctcccttcctctg 

cgccggacttggtgcgtctaagatgagggggccaggcggtggcttctccctgcgaggaggggagaattcttggggctgagctgg 

gagcccggcaactctagtatttaggataacttgtgccttggaaatgcaaactcaccgctccaatgcctactgagtagggggagcaaa 

tcgtgccttgtcattttatttggaggtttcctgcctccttcccgaggctacagcagacccccatgagagaaggaggggagcaggccc 

gtggaggaggggggctcagggagctgagatcccgacaagcccgccagccccagccgctcctccacgcctgtccttagaaaggg 

gtggaaacatagggacttggggcttggaacctaaggttgttccctagttctacatgaaggtggaggtctctagttccacgcctctccca 

cctccctccgcacacaccccacccagcctgctataggctggctttcccttggggctggaactcactgcgatggggtcaccaggtga 

ccagtggagcccccaccccgagtcagaccagaaagctaggtcgtgggtcagctctgaggatgtatacccctggtgggagaggga 

gacctagagatctggctgtggggcgggcatggggggtgaagggccactgggaccctcagccttgtttgtactgtatgccttcagcat 

tgcctaggaacacgaagcacgatcagtccatccagagggaccggagttatgacaagcttcccaaatattttgctttatcagccgatat 

caacacttgtatctggcctctgtgcccagcagtgccttgtgcaatgtgaatgtaccgtctctgctaaaccaccattttatttggMgtttt 
gtttggttttctcggatacttgccaaaatgagactctccgtcggcagctgggggaagggtctgagactctctttccttttggttttgggatt
acttttgatcctgggggaccaatgaggtgaggggggttctcctttgccctcagctttcccagccctccggcctgggctgcccacaag 
gcttctcccccagaggccctggctcctggtcgggaagggaggtgcctcccgccaacgcatcactggggctgggagcagggaag 
ggaattc (SEQ ID NO: 26) 

6.9. Cysteine-rich, Angiogenic Inducer, 61
[0370] See, e.g., GenBank Accession # XM_001831.
General Target Regions: 

[0371] (1) 5' Untranslated Region (GenBank Accession No.
gi|19200928|gb|BM844529.1|BM844529): 

agcgagagcgcccccgagcagcgcccgcgccctccgcgccttctccgccgggacctcgagcgaaagacgcccgcccgccgc 
ccagccctcgcctccctgcccaccgggcacaccgcgccgccaccccgaccccgctgcgcacggcctgtccgctgcacaccagc 
ttgttggcgtcttcgtcgccgcgctcgccccgggctactcctgcgcgccaca (SEQ ID NO: 27) 
[0372] (2) 3' Untranslated Region (687 nt) (GenBank Accession No.
12898379|emb|AL556057.1|AL556057):

taaatgctacctgggtttccagggcacacctagacaaacargggagaagagtgtcagaatcagaatcatggagaaaatgggcggg 
ggtggtgtgggtgatgggactcattgtagaaaggaagccttgctcattcttgaggagcattaaggtatttcgaaactgccaagggtgc 
tggtgcggatggacactaatgcagccacgattggagaatactttgcttcatagtattggagcacatgttactgcttcattttggagcttgt 
ggagttgatgactttctgttttctgtttgtaaattatttgctaagcatattttctctaggcttttttccttttggggttctacagtcgtaaaagaga 
taataagattagttggacagtttaaagcttttattcgtcctttgacaaaagtaaatgggagggcattccatcccttcctgaagggggaca 
ctccatgagtgtctgtgagaggcagctatctgcactctaaactgcaaacagaaatcaggtgttttaagactgaatgttttatttatcaaaa 
tgtagcttttggggagggaggggaaatgtaatactggaataatttgtaaatgattttaattttatattcagtgaaaagattttatttatggaat 



6.10. Basic Fibroblast Growth Factor
[0373] See, e.g., GenBank Accession No. NM_002006.

General Target Regions: 

[0374] (1) 5' Untranslated Region: 

cggccccagaaaacccgagcgagtagggggcggcgcgcaggagggaggagaactgggggcgcgggaggctggtgggtgtc 
gggggtggagatgtagaagatgtgacgccgcggcccggcgggtgccagattagcggacggctgcccgcggttgcaacgggatc 
ccgggcgctgcagcttgggaggcggctctccccaggcggcgtccgcggagacacccatccgtgaaccccaggtcccgggccg 
ccggctcgccgcgcaccaggggccggcggacagaagagcggccgagcggctcgaggctgggggac (SEQ ID NO: 
29) 

[0375] (2) 3' Untranslated Region (5.8 kb):

ctgctaagagctgattttaatggccacatctaatctcatttcacatgaaagaagaagtatattttagaaatttgttaatgagagtaaaagaa 
aataaatgtgtatagctcagtttggataattggtcaaacaattttttatccagtagtaaaatatgtaaccattgtcccagtaaagaaaaata 




iaaaaaa (SEQ ID NO: 28)
acaaaagttgtaaaatgtatattctcccttttatattgcatctgctgttacccagtgaagcttacctagagcaatgatctttttcacgcatttg
ctttattcgaaaagaggcttttaaaatgtgcatgtttagaaacaaaatttcttcatggaaatcatatacattagaaaatcacagtcagatgtt 
taatcaatccaaaatgtccactatttcttatgtcattcgttagtctacatgtttctaaacatataaatgtgaatttaatcaattcctttcatagttt 
tataattctctggcagttccttatgatagagtttataaaacagtcctgtgtaaactgctggaagttcttccacagtcaggtcaattttgtcaa 
acccttctctgtacccatacagcagcagcctagcaactctgctggtgatgggagttgtattttcagtcttcgccaggtcattgagatcca 
tccactcacatcttaagcattcttcctggcaaaaatttatggtgaatgaatatggctttaggcggcagatgatatacatatctgacttccca 
aaagctccaggatttgtgtgctgttgccgaatactcaggacggacctgaattctgattttataccagtctcttcaaaaacttctcgaaccg 
ctgtgtctcctacgtaaaaaaagagatgtacaaatcaataataattacacttttagaaactgtatcatcaaagattttcagttaaagtagca 
ttatgtaaaggctcaaaacattaccctaacaaagtaaagttttcaatacaaattctttgccttgtggatatcaagaaatcccaaaatattttc 
ttaccactgtaaattcaagaagcttttgaaatgctgaatatttctttggctgctacttggaggcttatctacctgtacatttttggggtcagct 
ctttttaacttcttgctgctctttttcccaaaaggtaaaaatatagattgaaaagttaaaacattttgcatggctgcagttcctttgtttcttga 
gataagattccaaagaacttagattcatttcttcaacaccgaaatgctggaggtgtttgatcagttttcaagaaacttggaatataaataat 
tttataattcaacaaaggttttcacattttataaggttgatttttcaattaaatgcaaatttgtgtggcaggatttttattgccattaacatattttt 
gtggctgctttttctacacatccagatggtccctctaactgggctttctctaattttgtgatgttctgtcattgtctcccaaagtatttaggag 
aagccctttaaaaagctgccttcctctaccactttgctggaaagcttcacaattgtcacagacaaagatttttgttccaatactcgttttgc 
ctctatttttcttgtttgtcaaatagtaaatgatatttgcccttgcagtaattctactggtgaaaaacatgcaaagaagaggaagtcacaga 
aacatgtctcaattcccatgtgctgtgactglagactgtcttaccatagactgtcttacccatcccctggatatgctcttgttttttccctcta 
atagctatggaaagatgcatagaaagagtataatgUttaaaacataaggcattcatctgccatttttcaattacatgctgacttcccttac 
aattgagatttgcccataggttaaacatggttagaaacaactgaaagcataaaagaaaaatctaggccgggtgcagtggctcatgcct 
atattccctgcactttgggaggccaaagcaggaggatcgcttgagcccaggagttcaagaccaacctggtgaaaccccgtctctac 
aaaaaaacacaaaaaatagccaggcatggtggcgtgtacatgtggtctcagatacttgggaggctgaggtgggagggttgatcact 
tgaggctgagaggtcaaggttgcagtgagccataatcgtgccactgcagtccagcctaggcaacagagtgagactttgtctcaaaa 
aaagagaaattttccttaataagaaaagtaatttttactctgatgtgcaatacatttgttattaaatttattatttaagatggtagcactagtctt 
aaattgtataaaatatcccctaacatgtttaaatgtccatttttattcattatgctttgaaaaataattatggggaaatacatgtttgttattaaa 
tttattattaaagatagtagcactagtcttaaatttgatataacatctcctaacttgtttaaatgtccatttttattctttatgcttgaaaataaatt 
atggggatcctatttagctcttagtaccactaatcaaaagttcggcatgtagctcatgatctatgctgtttctatgtcgtggaagcaccgg 
atgggggtagtgagcaaatctgccctgctcagcagtcaccatagcagctgactgaaaatcagcactgcctgagtagttttgatcagtt 
taacttgaatcactaactgactgaaaattgaatgggcaaataagtgcttttgtctccagagtatgcgggagacccttccacctcaagat 
ggatatttcttccccaaggatttcaagatgaattgaaatttttaatcaagatagtgtgctttattctgttgtattttttattattttaatatactgta 
agccaaactgaaataacatttgctgttttataggtttgaagaacataggaaaaactaagaggttttgtttttatttttgctgatgaagagata 
tgtttaaatatgttgtattgttttgtttagttacaggacaataatgaaatggagtttatatttgttatttctattttgttatatttaataatagaatta 
gattgaaataaaatataatgggaaataatctgcagaatgtgggtttcctggtgtttcctctgactctagtgcactgatgatctctgataag 
gctcagctgctttatagttctctggctaatgcagcagatactcttcctgccagtggtaatacgattttttaagaaggcagtttgtcaatttta 
atcttgtggatacctttatactcttagggtattattttatacaaaagccttgaggattgcattctattttctatatgaccctcttgatatttaaaa 
aacactatggataacaattcttcatttacctagtattatgaaagaatgaaggagttcaaacaaatgtgtttcccagttaactagggtttact 
gtttgagccaatataaatgtttaactgtttgtgatggcagtattcctaaagtacattgcatgttttcctaaatacagagtttaaataatttcagt
aattcttagatgattcagcttcatcattaagaatatcttttgtl^tatgttgagttagaaatgccttcatatagacatagtctttcagacctc^^ 
tgtcagttttcatttctagctgctttcagggttttatgaattttcaggcaaagctttaatttatactaagcttaggaagtatggctaatgccaa 
cggcagtttttttcttcttaattccacatgactgaggcatatatgatctctgggtaggtgagttgttgtgacaaccacaagcacttttttttttt 
ttaaagaaaaaaaggtagtgaatttttaatcatctggactttaagaaggattctggagtatacttaggcctgaaattatatatatttggcttg 
gaaatgtgtttttcttcaattacatctacaagtaagtacagctgaaattcagaggacccataagagttcacatgaaaaaaatcaattcattt 
gaaaaggcaagatgcaggagagaggaagccttgcaaacctgcagactgctttttgcccaatatagattgggtaaggctgcaaaaca 
taagcttaattagctcacatgctctgctctcacgtggcaccagtggatagtgtgagagaattaggctgtagaacaaatggccttctcttt 
cagcattcacaccactacaaaatcatcttttatatcaacagaagaataagcataaactaagcaaaaggtcaataagtacctgaaacca 
agattggctagagatatatcttaatgcaatccattttctgatggattgttacgagttggctatataatgtatgtatggtattttgatttgtgtaa 
aagttttaaaaatcaagctttaagtacatggacatttttaaataaaatatttaaagacaatttagaaaattgccttaatatcattgttggctaa 
atagaataggggacatgcatattaaggaaaaggtcatggagaaataatattggtatcaaacaaatacattgatttgtcatgatacacatt 
gaatttgatccaatagtttaaggaataggtaggaaaatttggtttctatttttcgatttcctgtaaatcagtgacataaataattcttagcttat 
tttatatttccttgtcttaaatactgagctcagtaagttgtgttaggggattatttctcagttgagactttcttatatgacattttactatgttttga 
cttcctgactattaaaaataaatagtagaaacaattttcataaagtgaagaattatataatcactgctttataactgactttattatatttatttc 
aaagttcatttaaaggctactattcatcctctgtgatggaatggtcaggaatttgttttctcatagtttaattccaacaacaatattagtcgta 
tccaaaataacctttaatgctaaactttactgatgtatatccaaagcttctccttttcagacagattaatccagaagcagtcataaacagaa 
gaataggtggtatgttcctaatgatattatttctactaatggaataaactgtaatattagaaattatgctgctaattatatcagctctgaggta 
atttctgaaatgttcagactcagtcggaacaaattggaaaatttaaatttttattcttagctataaagcaagaaagtaaacacattaatttcc 
tcaacatttttaagccaattaaaaatataaaagatacacaccaatatcttcttcaggctctgacaggcctcctggaaacttccacatatttt 
tcaactgcagtataaagtcagaaaataaagttaacataactttcactaacacacacatatgtagatttcacaaaatccacctataattggt 
caaagtggttgagaatatattttttagtaattgcatgcaaaatttttctagcttccatcctttctccctcgtttcttctttttttgggggagctggt 
aactgatgaaatcttttcccaccttttctcttcaggaaatataagtggttttgtttggttaacgtgatacattctgtatgaatgaaacattgga 
gggaaacatctactgaatttctgtaatttaaaatattttgctgctagttaactatgaacagatagaagaatcttacagatgctgctataaat 
aagtagaaaatataaatttcatcactaaaatatgctattttaaaatctatttcctatattgtatttctaatcagatgtattactcttattatttctatt 
gtatgtgttaatgattttatgtaaaaatgtaattgcttttcatgagtagtatgaataaaattgattagtttgtgttttcttgtctcccgaaaaaaa 

6.11. Cyclin D1

[0376] See, e.g., GenBank Accession No. NM_053056.

General Target Regions: 

[0377] (1) 5' Untranslated Region: 

cggccccagaaaacccgagcgagtagggggcggcgcgcaggagggaggagaactgggggcgcgggaggctggtgggtgtc 
gggggtggagatgtagaagatgtgacgccgcggcccggcgggtgccagattagcggacggctgcccgcggttgcaacgggatc 
ccgggcgctgcagcttgggaggcggctctccccaggcggcgtccgcggagacacccatccgtgaaccccaggtcccgggccg 
ccggctcgccgcgcaccaggggcc (SEQ ID NO: 31)

[0378] (2) 3' Untranslated Region (3.2 kb): 

tgagggcgccaggcaggcgggcgccaccgccacccgcagcgagggcggagccggccccaggtgctcccctgacagtccctc 
ctctccggagcattttgataccagaagggaaagcttcattctccttgttgttggttgttttttcctttgctctttcccccttccatctctgactta 
agcaaaagaaaaagattacccaaaaactgtctttaaaagagagagagagaaaaaaaaaatagtatttgcataaccctgagcggtgg 
gggaggagggttgtgctacagatgatagaggattttataccccaataatcaactcgtttttatattaatgtacttgtttctctgttgtaagaa 
taggcattaacacaaaggaggcgtctcgggagaggattaggttccatcctttacgtgtttaaaaaaaagcataaaaacattttaaaaac 
atagaaaaattcagcaaaccatttttaaagtagaagagggttttaggtagaaaaacatattcttgtgcttttcctgataaagcacagctgt 
agtggggttctaggcatctctgtactttgcttgctcatatgcatgtagtcactttataagtcattgtatgttattatattccgtaggtagatgtg 
taacctcttcaccttattcatggctgaagtcacctcttggttacagtagcgtagcgtggccgtgtgcatgtcctttgcgcctgtgaccacc 
accccaacaaaccatccagtgacaaaccatccagtggaggtttgtcgggcaccagccagcgtagcagggtcgggaaaggccacc 
tgtcccactcctacgatacgctactataaagagaagacgaaatagtgacataatatattctatttttatactcttcctatttttgtagtgacct 
gtttatgagatgctggttttctacccaacggccctgcagccagctcacgtccaggttcaacccacagctacttggtttgtgttcttcttcat 
attctaaaaccattccatttccaagcactttcagtccaataggtgtaggaaatagcgctgtttttgttgtgtgtgcagggagggcagttttc 
taatggaatggtttgggaatatccatgtacttgtttgcaagcaggactttgaggcaagtgtgggccactgtggtggcagtggaggtgg 
ggtgtttgggaggctgcgtgccagtcaagaagaaaaaggtttgcattctcacattgccaggatgataagttcctttccttttctttaaaga 
agttgaagtttaggaatcctttggtgccaactggtgtttgaaagtagggacctcagaggtttacctagagaacaggtggtttttaagggt 
tatcttagatgtttcacaccggaaggtttttaaacactaaaatatataatttatagttaaggctaaaaagtatatttattgcagaggatgttca 
taaggccagtatgatttataaatgcaatctccccttgatttaaacacacagatacacacacacacacacacacacacacaaaccttctg 
cctttgatgttacagatttaatacagtttatttttaaagatagatccttttataggtgagaaaaaaacaatctggaagaaaaaaaccacaca 
aagacattgattcagcctgtttggcgtttcccagagtcatctgattggacaggcatgggtgcaaggaaaattagggtactcaacctaa 
gttcggttccgatgaattcttatcccctgccccttcctttaaaaaacttagtgacaaaatagacaatttgcacatcttggctatgtaattctt 
gtaatttttatttaggaagtgttgaagggaggtggcaagagtgtggaggctgacgtgtgagggaggacaggcgggaggaggtgtg 
aggaggaggctcccgaggggaaggggcggtgcccacaccggggacaggccgcagctccattttcttattgcgctgctaccgttg 
acttccaggcacggtttggaaatattcacatcgcttctgtgtatctctttcacattgtttgctgctattggaggatcagttttttgttttacaat 
gtcatatactgccatgtactagttttagttttctcttagaacattgtattacagatgccttttttgtagttttttttttttttatgtgatcaattttgact 
taatgtgattactgctctattccaaaaaggttgctgtttcacaatacctcatgcttcacttagccatggtggacccagcgggcaggttctg 
cctgctttggcgggcagacacgcgggcgcgatcccacacaggctggcgggggccggccccgaggccgcgtgcgtgagaacc 
gcgccggtgtccccagagaccaggctgtgtccctcttctcttccctgcgcctgtgatgctgggcacttcatctgatcgggggcgtagc 
atcatagtagtttttacagctgtgttattctttgcgtgtagctatggaagttgcataattattattattattattataacaagtgtgtcttacgtgc 
caccacggcgttgtacctgtaggactctcattcgggatgattggaatagcttctggaatttgttcaagttttgggtatgtttaatctgttatg 
tactagtgttctgtttgttattgttttgttaattacaccataatgctaatttaaagagactccaaatctcaatgaagccagctcacagtgctgt 
gtgccccggtcacctagcaagctgccgaaccaaaagaatttgcaccccgctgcgggcccacgtggttggggccctgccctggca 
gggtcatcctgtgctcggaggccatctcgggcacaggcccaccccgccccacccctccagaacacggctcacgcttacctcaacc 
atcctggctgcggcgtctgtctgaaccacgcgggggccttgagggacgctttgtctgtcgtgatggggcaagggcacaagtcctgg
atgttgtgtgtatcgagaggccaaaggctggtggcaagtgcacggggcacagcggagtctgtcctgtgacgcgcaagtctgaggg 
tctgggcggcgggcggctgggtctgtgcatttctggttgcaccgcggcgcttcccagcaccaacatgtaaccggcatgtttccagca 
gaagacaaaaagacaaacatgaaagtctagaaataaaactggtaaaaccccaaaaaaaaaaaaaaaa (SEQ ID NO: 32) 

6.12. Murine Double Minute 2 
[0379] See, e.g., GenBank Accession No. NM_002392. 
General Target Regions: 

[0380] (1) 5' End/Intron 1/p53 BS for s-mdm-2: U39736:

gcaccgcggcgagcttggctgcttctggggcctgtgtggccctgtgtgtcggaaagatggagcaagaagccgagcccgaggggc 

ggccgcgacccctctgaccgagatcctgctgctttcgcagccaggagcaccgtccctccccggattagtgcgtacgagcgcccag 

tgccctggcccggagagtggaatgatccccgaggcccagggcgtcgtgcttccgcgcgccccgtgaaggaaactggggagtctt 

gagggacccccgactccaagcgcgaaaaccccggatggtgaggagcaggtactggcccggcagcgagcggtcacttttggg+Qi 

tgggctctgacggtgtcccctctatcgctggttcccagcctctgcccgttcgcagcctttgtgcggttcgtgnctgggggctcppgF 

gcggggcgcggggcatgggncacgtggctttgcggaggttttgttggactggggctagacagtccccgccagggaggagggcg 

ggatttcggacggctctcgcggcggtgggggtgggggtggttcggaggtctccgcgggagttcagggtaaaggtcacggggcc 

ggggctgcgggccgcttcggcgcgggaggtccggatgatcgcagtgcctgtcgggtcactagtgtgaacgctgcgcgtagtctg 

ggcgggattgggccggttcagtgggcaggttgactcagcttttcctcttgagctggtcaagttcagacacgttccgaaactgcagtaa 

aaggagttaagtcctgacttgtctccagctggggctatttaaaccatgcattttcccagctgtgttCAGTGGCGATTGGAG 

GGTAGACCTGTGGGCACGGACGCACGCCACTTTTTCTCTGCTGATCCAGgtaagcac 

cgacttgcttgtagctttagttttaactgttgtttatgttctttatatatgatgtattttccacagatgtttcatgaW^ 

tttttccttgtaggcaaatgtgcaataccaacatgtctgtacc (SEQ ID NO: 33) 

[0381] (2) 3' UTR (GenBank Accession No. gi|9150029|gb|BE275079.1|BE275079):

tagttgacctgtctataagagaattatatatttctaactatataaccctaggaatttagacaacctgaaatttattcacatatatcaaagtga 

gaaaatgcctcaattcacatagatttcttctctttagtataattgacctactttggtagtggaatagtgaatacttactalaatttgactt^ 

atgtagctcatcctttacaccaactcctaallttaaataatttctactctgtcttaaatgagaagtacttg glUUlUll cttaaatatgtata 

acatttaaatgtaacttattattttttttgagaccgagtcttgctctgttacccaggctggagtgcagtgggtgatcttggctcactgc^^ 

ctctgccctccccgggttcgcaccattctcctgcctcagcctcccaattagcttggcctacagtcatctgccaccacacctggctaattt 

tttgtacttttagtagagacagggtttcaccgtgttagccaggatggtctcgatctcctgacctcgtgatccgcccacctcggcctccca 

aagtgctgggattacaggcatgagccaccgtgctctccagcctaggcaacagagtgagactctgtctccaaaaaaaaaaaaaaaaa 

aaggggactataacacccccagggaaagggacaggtgggacattcttattcttaatttaaataaattgacaggggaaagttgggcca 

ctcttgagcttgtgggtgctcaccaggttgaccccaaaaaaagaagccttccacaaaacattaatttatttccctaatatacccgcctct 

gtgagttaagggataatgcatcaggactcttgcaaccagacaaaattatttaaaaacgccacttgggggggaggcgggtccctcctg 

gggattcgcctttgtgggagagaaaactgcacagacttgggcaaataatgttttttgtcaccccaaaacgtattcgcgagacatttcatt 
agaacgaagctttaccctaatattgaactccccatttaaacagtttccacacacacttagggagatttttccctctgtgagttccgcagaa
caatagttggacgggaatagaaccctgaaacactttagttcaccacgaactattatagggcggg (SEQ ID NO: 34) 



6.13. Protein Tyrosine Phosphatase Type IV A, Member 3 
[0382] See, e.g., GenBank Accession No. NM_032611.
General Target Regions: 

[0383] (1) 5' Untranslated Region: 

tgactatccagctctgagagacgggagtttggagttgcccgctttactttggttgggttggggggggcggcgggctgttttgttcctttt 
cttttttaagagttgggttttcttttttaattatccaaacagtgggcagcttcctcccccacacccaagtatttgcacaatatttgtgcggggt 
atgggggtgggtttttaaatctcgtttctcttggacaagcacagggatctcgttctcctcattttttgggggtgtgtggggacttctcaggt 
cgtgtccccagccttctctgcagtcccttctgccctgccgggcccgtcgggaggcgcc (SEQ ID NO: 35)
[0384] (2) 3' Untranslated Region: 

tagctcaggaccttggctgggcctggtcgtcatgtaggtcaggaccttggctggacctggaggccctgcccagccctgctctgccc 

agcccagcaggggctccaggccttggctggccccacatcgccttttcctccccgacacctccgtgcacttgtgtccgaggagcgag 

gagcccctcgggccctgggtggcctctgggccctttctcctgtctccgccactccctctggcggcgctggccgtggctctgtctctct 

gaggtgggtcgggcgccctctgcccgccccctcccacaccagccaggctggtctcctctagcctgtttgttgtggggtgggggtat 

attttgtaaccactgggcccccagcccctcttttgcgaccccttgtcctgacctgttctcggcaccttaaattattagaccccggggcag 

6.14. Tissue Inhibitor of Metalloproteinase

[0385] See, e.g., GenBank Accession No. XM_003061 for TIMP-4.

General Target Regions: 

[0386] (1) 5' Untranslated Region (GenBank Accession No.
gi|11293824|gb|BF346229.1|BF346229):

gctcagcaaggggtccgtccttctctgtcactgtctcttttgcctgttgtaattctgtctgcctctctgggactctgcctgtctcactctttct 

gtctgtgcctctcctcactcttgttctttctgcctgaatcacagccctcagtttttctgtcctcatgcatttgtctttgtggctctttccgtctttc 

tgcccttgacaccatcccctctcccagtgcttcccctctgcttccagatcgcttcatgacttaggcagggaaacagaggtcagggcct 

ccttccaggcttccctctgcatcttactgagtatgcaggtcggaagagcctcgggtcctgcctccgcgggtggcctagagccaaagg 

aaggcggagcccgtcggggcgggattggcccttagggccacctcataaagcctggggcgaggggcacaacggccttgggaag 

gagccctgctggggccgtccagtcccccagacctcacaggctcagtcgcggatctgcagtgtc (SEQ ID NO: 37) 

[0387] (2) 3' Untranslated Region: 

tagtagggaccagtgaccatcacatcccttcaagagtcctgaagatcaagccagttctccttccctgcagagctttggccattaccac 
ctgacctcttgctgccagctaataagaagtgccaagtggacagtctggccactgtcaaggcagggaaggggccatgacttttctgcc 
ctgccctcagcctgttgccctgcctcccaaaccccattagtctagccttgtagctgttactgcaagtgtttcttctggcttagtctgttttct 
aaagccaggactattccctttcctccccaggaatatgtgttttcctttgtcttaatcgatctggtaggggagaaatggcgaatgtcataca 
catgagatggtatatccttgcgatgtacagaatcagaaggtggtttgacagcatcataaacaggctgactggcaggaatgaaaaaaa
aaaaaaaaaaaa (SEQ ID NO: 38) 

[0388] See, e.g., GenBank Accession No. S48568 for TIMP-2.
General Target Regions: 

[0389] (1) 5' Untranslated Region (84% GC-rich):

ggggccgccgagagccgcagcgccgctcgcccgccgccccccaccccgccgccccgcccggcgaattgcgccccgcgccct 

cccctcgcgcccccgagacaaagaggagagaaagtttgcgcggccgagcgggcaggtgaggagggtgagccgcgcggagg 

ggcccgcctcggccccggctcagcccccgcccgcgcccccagcccgccgccgcgagcagcgcccggaccccccagcggcg 

gccccgcccgcccagccccccggcccgcc (SEQ ID NO: 39) 

[0390] (2) 3' Untranslated Region (GenBank Accession No.
18505971|gb|BM456931.1|BM456931):

taagcaggcctccaacgcccctgtggccaactgcaaaaaaagcctccaagggtttcgactggtccagctctgacatcccttcctgga 
aacagcatgaataaaacactcatcccatgggtccaaattaatatgattctgctccccccttctccttttagacatggttgtgggtctggag 
ggagacgtgggtccaaggtcctcatcccatcctccctctgccaggcactatgtgtctggggcttcgatccttgggtgcaggcagggc 
tgggacacgcggcttccctcccagtccctgccttggcaccgtcacagatgccaagcaggcagcacttagggatctcccagctgggt 
tagggcagggcctggaaatgtgcattttgcagaaacttttgagggtcgttgcaagactgtgtagcaggcctaccaggtccctttcatct 
tgagagggacatggccccttgttttctgcagcttccacgcctctgcactccctgcccctggcaagtgctcccatcgcccccggtgccc 
accatgnagctccccgcacctgactccccccacatccaagggcagccctggaaccagtgggctagttccttgaaggaagccccac 
tcattcctattaatccctcagaattcccggggggagccttccctcctgaaccttggtaaaaaatggggaacgagaaaaacccccgctt 
ggagctgtgcgtttccagcccctacttgagagncttttttttgggggccg (SEQ ID NO: 40) 
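
Descriptors used in this section, such as "84% GC-rich" and "58% AT-density", are base-composition fractions of the listed region. The following Python sketch shows one way such a fraction could be computed; it assumes that every listed character, including ambiguity codes such as n, counts toward the total length, which may differ from how the figures above were originally derived. The function names are hypothetical.

```python
def base_fraction(seq: str, bases: str) -> float:
    """Fraction of characters in seq that belong to the given set of bases.

    Whitespace from the sequence listing is ignored. Ambiguity codes such
    as 'n' remain in the denominator (an assumption of this sketch).
    """
    cleaned = "".join(seq.split()).lower()
    if not cleaned:
        raise ValueError("empty sequence")
    wanted = set(bases.lower())
    return sum(1 for nt in cleaned if nt in wanted) / len(cleaned)


def gc_fraction(seq: str) -> float:
    """Fraction of g and c residues."""
    return base_fraction(seq, "gc")


def at_fraction(seq: str) -> float:
    """Fraction of a and t (or u, for RNA text) residues."""
    return base_fraction(seq, "atu")
```

Multiplying the returned value by 100 gives a percentage comparable in form to the figures quoted for the untranslated regions.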

6.15. Peroxisome Proliferative Activated Receptor-γ
[0391] See, e.g., GenBank Accession No. NM_138712. 
General Target Regions: 

[0392] (1) 5' Untranslated Region (GenBank Accession No.
12786927|emb|AL523434.1|AL523434):
cgcgccgggcccggctcggcccgacccggctccgcgcgggcaggcggggcccagcgcactcggagcccgagcccgagccg 
cagccgccgcctggggcgcttgggtcggcctcgaggacaccggagaggggcgccacgccgccgtggccgcagatttgaaaga 
agccgacactaaaccaccaatatacaacaaggccattttgtcaaacgagagtcagcctttaacgaaa (SEQ ID NO: 41) 
[0393] (2) 3' Untranslated Region: 

tagcagagagtcctgagccactgccaacatttcccttcttccagttgcactattctgagggaaaatctgacacctaagaaatttactgtg 
aaaaagcattttaaaaagaaaaggttttagaatatgatctattttatgcatattgtttataaagacacatttacaatttacttttaatattaaaa 
attaccatattatgaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaa (SEQ ID NO: 42) 

6.16. PC-cell Derived Growth Factor/Epithelin/Granulin Precursor
[0394] See, e.g., GenBank Accession No. NM_002087.

General Target Regions: 

[0395] (1) 5' Untranslated Region: 

ggcacgaggggcgagaggaagcagggaggagagtgatttgagtagaaaagaaacacagcattccaggctggccccacctctat 

attgataagtagccaatgggagcgggtagccctgatccctggccaatggaaactgaggtaggcgggtcatcgcgctggggtctgta 

gtctgagcgctacccggttgctgctgcccaaggaccgcggagtcggacgcaggcagaccatgtggaccctggtgagctgggtgg 

ccttaacagcagggctggtggctggaacgcggtgcccagatggtcagttctgccctgtggcctgctgcctggaccccggaggagc 

cagctacagct (SEQ ID NO: 43) 

[0396] (2) 3' Untranslated Region: 

tgagggacagtactgaagactctgcagccctcgggaccccactcggagggtgccctctgctcaggcctccctagcacctcccccta 
accaaattctccctggaccccattctgagctccccatcaccatgggaggtggggcctcaatctaaggccttccctgtcagaaggggg 
ttgtggcaaaagccacattacaagctgccatcccctccccgtttcagtggaccctgtggccaggtgcttttccctatccacaggggtgt 

(SEQ ID NO: 44)

6.17. Angiogenin 
[0397] See, e.g., GenBank Accession No. M11567.
General Target Regions: 
[0398] (1) 5' Untranslated Region: 

tgtttgcattaagttcatagattataatttgtaatggaatcaacaccaaatgcaaattagaaagagagcccactttgctcacccagtcacg 

tcttcccatgtaaccatagaacgttggggtcctgtgtctttctagatccacagtcttgctctcagaacaggctagccacaccacaggcc 

tagtgccaggacccatggcctttttttaagctcagactcccttctgtgaacagcaatatccccacaacttgtacaacattggtgcttcctg 

caagggctacagaactatttgatacgaaaatgttcattgacttacacacaagagaagcacaaaataaaaaattaataattaatttaatgt 

ctttgaaaatgtaccatttatttttacatttggggtcataagaattgtattacacttaagaatgcaatacaatttgaagatcagatttttctccc 

tttgtgagaatttctcagtatgtgtgatgactaccaagaaatcatagccagtcataaattcagtgagttactcataaacgaacaagaacc 

acctacttcttggggaggtaggtctgcttcccttcaactcaggatacaactgctttcaactgctttcttcacattagctgactaattagcta 

gaagcctgtcgtaaacaattttatggttgactccttccctgggctcagggttccctagaacagagaggtccccaaatcccggtctgtg 

gcctgtccgcctaagctctgcctcctgccagatcagcaggcagcattagattctcataggagctggacgcctattgtgaactgcgcat 

gtgcgggatccagattgtgcactctttatgagaatctaactaatgcttgatgatctatctgaaccagaacaatttcatcctgaaaccatcc 

cccaccaatccatagaaatactgtcttccacaaaaatgatccctggtgccaaaaatgttagagaccactcccctaaaactctcttcttag 

ctctcacctcctgtattactatctcatctcagtacattgaagcccccatcttttccccatggatgcctcatttcctattagggaggcattttttt 

attttttgtttttatttttttccgagacggagtctcgctctgtcgccaaggctggagtgcagtggcgcgatctcggctcactgcaagctcc 

gcctcccgggttcacgccattctcctgcctcagcctcccaagtagctgggactacaggcgcccgcactacgcccggctaattttttgt 

atttttagtagagacggggtttcaccgtggtagccaggatggtctcgatctcctgacctcgtgatccgcccgccttggcctcccaaagt 

gctgggattacaggcgtgagaccgcgcccggccgtcatttggtatgtcttaatgtgcctcaggacctagcacagtccctggtaccca 

gtagagacctatgtaatgttcgttattcaataataaatacatgaattaaagagtgagagtggattttgtaatgttacgactgatagagaaa 
tactcagtgattctaagggatggggaagaacggttggagctagaggttgtgctcaggaaactattaaatagacgttccgcaggaagg

gattgacgaagtgtgaggttaatgaggaagggaaaatagaatataaaatttggtggtggaaaagatctgattcatga (SEQ ID NO: 45)

[0399] (2) 3' Untranslated Region:

taaccagcgggcccctggtcaagtgctggctctgctgtccttgccttccatttcccctctgcacccagaacagtggtggcaacattcat 

tgccaagggcccaaagaaagagctacctggaccttttgttttctgtttgacaacatgtttaataaataaaaatgtcttgatatcagtaaga 

atcagagtcttctcactgattctgggcatattgatctttcccccattttctctacttggctgctccctgagaggactgcataggatagaaat 

gcctttttcttttcttttcgtttttttttttttttttttttgagatggagtctcactctgtcgcccaggcttaagtgcaatggcacaatctcggctca 

ctgcaacctctctctcctgggttcaagtgattctcctgcctcagcctcccaaatagctgagattacaggcatgcaccaccacacctggc 

taatttttgtgtttttagtagagacagggtttcaccgttttggccaggttggtcttgaactcctgacctcgggagatccgcccaccttggc 

ctctctttgtgctgggattacaggcatgagccactgagccgggccactttttccttatcagtcagtttttacaagtcattagggaggtaga 

ctttacctctctgtgaaggaaagtatggtatgttgatctacagagagagatggaaaaattccagggctcgtagctactaagcagaattt 

ccaagataggcaaattgttttttctgtcaaataataagctaatattacttctacaaatatgagaccttggagagaagtttccaaggaccaa 

gtaccaacataccaacagattattatagtttctctcactcttacacacacacacacacatatacacatatgtaatccagcatgaataccaa 

aattcattcagggtagccaccttttgtcttaatcgagagataattttgatgtttgaatggaatgctcccaggatattctcttgtcatggttattt 

tatataaaattcaaaaaccaattacattatttcctctgtaatcttttactttatcaactaatgtctggcaagtgtgatgttttggggaagttata 

gaagattccggccaggcgcttatctcacgcttgtaatccagcactttgggaagctgaggcggacagatcacgaggtcaagagatca 

agaccatcctggacaacatggtgaaaccttgtctctactaaaaatgtgaaaattagctgggcgtggtggcacacacctatagtcccag 

ctactcgggaggctgaggcaggagaatcgcttgaacctaggaggcggaggttgcactgagccgagatcacgccactgcactcca 

gcctgggcgacagagcgagactccatctcaaaaaaaaaaaaaaaagaaagatcccagtttatcccagtttatcccttattcttcctcaa 

ttctcaagatttgtttttaagttaacataacttaggttaacacactctttgtaaaatacactgttcaatctacagactcagtggttagcttcctg 

ttaactaatttctgttgacaggtacttggatattttatttagaaagtggttgccaataaattagttataagtcgccagtttcactgccttgtga 

acacataattattgtggtctcagtattccctatggtggcttctcctgctcctggtattgccctgaaatgggccaaaagccgtggctcccc 

aatgctcaggttatagaacattgtccaggtaccacctaggagagcccagcctcactgaaagtattcaaatttaggaatgggtttgaga 

agtaggtagctggtatgtgcttagcacaagaatctctcttccttgggttagtctgtttcaaaactgaaaacactgtcattccttaagaaaat 

aggaaaaagtattccaaacctctgtcactagaaaatttgccatattaccaaatctcaaaaacctctcaggaaatgagaaagtcccagtt 

tctggtaaactatttgggcccttttctcaagttctccttccagtgctatttccttgaggtgaggcaaagttactcaagatcatcgctgccac 

tcaaggccttgatagggcaagtgaaaggcatggaccattattatattgatcacagcataagctgtgaaaacccacatcttctccaaac 

atctgcttggagcattatcatcgcatagtttgctctggtgttcagggaaatcgctgtttcataggaaatcacatggcagtgggatggga 

gtgtttcctgacctgccgatggtactggcacctgagcaagcattcctagtcctttttggtctgggcctcttgttctatcacaaccacaagc 

tgtttaaaataaaaacgtcaagtcacaggcaggtcattttatcctgcgtgaatcaattgaag (SEQ ID NO: 46) 

6.18. Hypoxia-inducible Factor-α
[0400] See, e.g., GenBank Accession No. U22431.
General Target Regions: 
[0401] (1) 5' Untranslated Region:

tcctcagtgcacagtgctgcctcgtctgaggggacaggaggatcaccctcttcgtcgcttcggccagtgtgtcgggctgggccctga 
caagccacctgaggagaggctcggagccgggcccggaccccggcgattgccgcccgcttctctctagtctcacgaggggtttccc 
gcctcgcacccccacctctggacttgcctttccttctcttctccgcgtgtggagggagccagcgcttaggccggagcgagcctggg 

ggccgcccgccgtgaagacatcgcggggaccgattcacc (SEQ ID NO: 47) 
[0402] (2) 3' Untranslated Region (GenBank Accession No. gi|19116743|gb|BM799920.1|BM799920):
tgagctttttcttaatttcattcctttttttggacactggtggctcactacctaaagcagtctatttatattttctacatctaattttagaagcctg 
gctacaatactgcacaaacttggttagttcaatttttgatcccctttctacttaatttacattaatgctcttttttagtatgttctttaatgctggat 
cacagacagctcattttctcagttttttggtatttaaaccattgcattgcagtagcatcattttaaaaaatgcacctttttatttatttatttttgg 
ctagggagtttatccctttttcgaattatttttaagaagatgccaatataatttttgtaagaaggcagtaacctttcatcatgatcataggca 
gttgaaaaatttttacaccttttttttcacattttacataaataataatgctttgccagcagtacgtggtagccacaattgcacaatatattttct 
taaaaaataccagcagttactcatggaatatattctgcgtttataaaactagtttttaagaagaaattttttttggcctatgaaattgttaaac 
ctggaacatgacattgttaatcatataataatgattcttaaatgctgtatggtttattatttaaatgggtaaagccatttacataatatagaaa 
gatatgcatatatctagaaggtatgtggcatttatttggataaaattctcaattcagagaaatcatctgatgtttctatagtcactttgccag 
ctcaaaagaaaaoaataccctatgtagttgtggaagtttatgctaatattgtgtaactgatattaaacctaaatgttctgcctaccctgttg 
gtataaagatattttgagcagactgtaaacaagaaaaaaaaaatcatgcattcttagcaaaattgcctagtatgttaatttgctcaaaata 
caatgtttgattttatgcactttgtcgctattaacatcctttttttcatgtagatttcaataattgagtaattttagaagcattattttaggaatata 
tagttgtcacagtaaatatcttgttttttctatgtacattgtacaaatttttcattccttttgctctttgtggttggatctaacactaactgtattgtt 

6.19. Large Tumor Suppressor, Homolog 1
[0403] See, e.g., GenBank Accession No. XM_015547.
General Target Regions: 

[0404] (1) 5' Untranslated Region (GenBank Accession No. gi|19008744|gb|BM695486.1|BM695486):
agacagccttaacccacgggcgcgggcgagtcgtatgggcaggggcaggcgggagcgacgtggggcgacgctcacgaacga 
tcagagctgcgggcgacgcaacgaagcccggaggccgcaggctgcgcgctccctcgcagcagccgggcgggcaaaagcccc 
cagtcctcggcccccgcgcaagcgacgccgggaaa (SEQ ID NO: 49) 
[0405] (2) 3' Untranslated Region (GenBank Accession No. gi|12274655|gb|BF884528.1|BF884528):

taattatttatattgtaaagaattttaacagtcctggggacttccttgaaggatcattttcacttttgctcagaagaaagctctggatctatca 
aataaagaagtccttcgtgtgggctacatatatagatgttttcatgaagaggagtgaaaagccagaaggatatagacaaatgaggcct 
aagacctttcctgccagtaactatactgtcagtagccggcaaatgttacaagaaattcgggaatcccttaggaatttatctaaaccatct 
gatgctgctaaggctgagcataacatgagtaaaatgtcaaccgaagatcctcgacaagtcagaaatccacccaaatttgggacgcat 
cataaagccttgcaggaaattcgaaactctctgcttccatttgcaaatgaaacaaattcttctcggagtacttcagaagttaatccacaa 

atgcttcaagacttgcaagctgctggatttgatgaggatatggttatacaagctcttcagaaaactaacaacagaagtatagaagcag 

caattgaattcattagtaaaatgagttaccaagatcGtcgacgagagcagatggctgcagcagctgccagacctattaatgccagcat 

gaaaccagggaatgtgcagcaatcagttaaccgcaaacagagctggaaaggttctaaagaatccttagttcctcagaggcatggcc 

cgccactaggagaaagtgtggcctatcattctgagagtcccaactcacagacagatgtaggaagacctttgtctggatctggtatatc 

agcatttgttcaagctcaccctagcaacggacagagagtgaaccccccaccaccacctcaagtaaggagtgttactcctccaccac 

ctccaagaggccagactccccctccaagaggtacaactccacctcccccttcatgggaaccaaactctcaaacaaagcgctattctg 

gaaacatggaatacgtaatctcccgaatctctcctgtcccacctggggcatggcaagagggctatcctccaccacctctcaacacttc 

ccccatgaatcctcctaatcaaggacagagaggcattagttctgttcctgttggcagacaaccaatcatcatgcagagttctagcaaat 

ttaactttccatcagggagacctggaatgcagaatggtactggacaaactgatttcatgatacaccaaaatgttgtccctgctggcact 

gtgaatcggcagccaccacctccatatcctctgacagcagctaatggacaaagcccttctgctttacaaacagggggatctgctgct 

ccttcgtcatatacaaatggaagtattcctcagtctatgatggtgccaaacagaaatagtcataacatggaactatataacattagtgtac 

ctggactgcaaacaaattggcctcagtcatcttctgctccagcccagtcatccccgagcagtgggcatgaaatccctacatggcaac 

ctaacataccagtgaggtcaaattcttttaataacccattaggaaatagagcaagtcactctgctaattctcagccttctgctacaacagt 

cactgcaattacaccagctcctattcaacagcctgtgaaaagtatgcgtgtattaaaaccagagctacagactgctttagcacctacac 

acccttcttggataccacagccaattcaaactgttcaacccagtccttttcctgagggaaccgcttcaaatgtgactgtgatgccacctg 

ttgctgaagctccaaactatcaaggaccaccaccaccctacccaaaacatctgctgcaccaaaacccatctgttcctccatacgagtc 

aatcagtaagcctagcaaagaggatcagccaagcttgcccaaggaagatgagagtgaaaagagttatgaaaatgttgatagtggg 

gataaagaaaagaaacagattacaacttcacctattactgttaggaaaaacaagaaagatgaagagcgaagggaatctcgtattcaa 

agttattctcctcaagcatttaaattctttatggagcaacatgtagaaaatgtactcaaatctcatcagcagcgtctacatcgtaaaaaac 

aattagagaatgaaatgatgcgggttggattatctcaagatgcccaggatcaaatgagaaagatgctttgccaaaaagaatctaatta 

catccgtcttaaaagggctaaaatggacaagtctatgtttgtgaagataaagacactaggaataggagcatttggtgaagtctgtctag 

caagaaaagtagatactaaggctttgta.tgcaacaaaaactcttcgaaagaaagatgttcttcttcgaaatcaagtcgctcatgttaagg 

ctgagagagatatcctggctgaagctgacaatgaatgggtagttcgtclBlBttattcattccaagataaggacaatttatactttgtaat 

ggactacattcctgggggtgatatgatgagcctattaattagaatgggcatctttccagaaagtctggcacgattctacatagcagaac 

ttacctgtgcagttgaaagtgttcataaaatgggttttattcatagagatattaaacctgataatattttgattgatcgtgatggtcatattaa 

attgactgactttggcctctgcactggcttcagatggacacacgattctaagtactatcagagtggtgaccatccacggcaagatagc 

atggatttcagtaatgaatggggggatccctcaagctgtcgatgtggagacagactgaagccattagagcggagagctgcacgcca 

gcaccagcgatgtctagcacattctttggttgggactcccaattatattgcacctgaagtgttgctacgaacaggatacacacagttgt 

gtgattggtggagtgttggtgttattctttttgaaatgttggtgggacaacctcc11i:cttggcacaaacaccattagaaa 

ggtcacctgctgctatatacatcattggctcgagaagaaactactgaacaccctgcgagagagaagcctagaaaagaaagaaagg 

gccaaaaggttttgaactcttcatccctaatttgctacactgatcaaaaccaagtaagggctcctgaagtccatgagtctatcatcaatc 

agcacaaatgctatactagtttgtaactgcggggtcagttgtgaaggggaaggacagcagtcttatccatattccaggaagccacagt 

aaactgctcga (SEQ ID NO: 50) 

6.20. P-Glycoprotein

[0406] See, e.g., GenBank Accession No. M14758.

General Target Sequences: 

[0407] (1) 5' Untranslated Region: 

cctactctattcagatattctccagattcctaaagattagagatcatttctcattctcctaggagtactcacttcaggaagcaaccagataa 
aagagaggtgcaacggaagccagaacattcctcctggaaattcaacctgtttcgcagtttctcgaggaatcagcattcagtcaatccg 
ggccgggagcagtcatctgtggtgaggctgattggctgggcaggaacagcgccggggcgtgggctgagcacagcgcttcgctct
ctttgccacaggaagcctgagctcattcgagtagcggctcttccaagctcaaagaagcagaggccgctgttcgtttcctttaggtcttt 
ccactaaagtcggagtatcttcttccaagatttcacgtcttggtggccgttccaaggagcgcgaggtcggg (SEQ ID NO: 51)

[0408] (2) 3' Untranslated Region (GenBank Accession No. gi|13334786|gb|BG428280.1|BG428280):
tgaactctgactgtatgagatgttaaatactttttaatatttgtttagatatgacatttattcaaagttaaaagcaaacacttacagaattatga 
agaggtatctgtttaacatttcctcagtcaagttcagagtcttcagagacttcgtaattaaaggaacagagtgagagacatcatcaagtg 
gagagaaatcatagtttaaactgcattataaattttataacagaattaaagtagattttaaaagataaaatgtgtaattttgtttatattttccc 
atttggactgtaactgactgccttgctaaaagattatagaagtagcaaaaagtattgaaatgtttgcataaagtgtctataataaaactaa 
actttcatgtgactggagtcatcttgtccaaactgcctgtgaatatatcttctctcaattggaatattgtagataacttctgctttaaaaaagt 
tttctttaaatatacctactcatttttgtgggaatggttaagcagtttaaataattcctgtgtatatgtctatcacataggggtctaacagaac 
aatctggattcattatttctaggacttgatcctgctgatgctgaatttgcacattaaggtgtgttaacaaccaaaacacagatcgatataa 
gaagtaaggaggtggggagaggcaaattatgatgtgctatgagttagatgtatagt (SEQ ID NO: 52) 

6.21. CD82 Antigen 
[0409] See, e.g., GenBank Accession No. NM_002231.
General Target Regions: 

[0410] (1) 5' Untranslated Region (GenBank Accession No. gi|19088880|gb|BM759265.1|BM759265):
agtccgcggcgttccccggctgcagccgggagggggccgaggagtgactgagccccgggctgtgcagtccgacgccgactga 
ggcacgagcgggtgacgctgggcctgcagcgcggagcagaaagcagaacccgcagagtcctccctgctgctgtgtggacgaca 
cgtgggcacaggcagaagtgggccctgtgaccagctgcactggtttcgtggaaggaagctccaggactggcggg (SEQ ID NO: 53)

[0411] (2) 3' Untranslated Region: 
tgaggcagctgctatcccc 

atctccctgcctggcccccaacctcagggctcccaggggtctccctggctccctcctccaggcctgcctcccacttcactgcgaaga 
ccctcttgcccaccctgactgaaagtagggggctttctggggcctagcgatctctcctggcctatccgctgccagccttgagccctgg 
ctgttctgtggttcctctgctcaccgcccatcagggttctcttatcaactcagagaaaaatgctccccacagcgtccctggcgcaggtg 
ggctggacttctacctgccctcaagggtgtgtatattgtataggggcaactgtatgaaaaattggggaggagggggccgggcgcgg 
tgctcacgcctgtaatcccagcactttgggaggccgaggcgggtggatcacgaggtcaggagatcgagaccatcctggctaacat 

tgaggcaggagaatggtgtgaacccgggagcggaggttgcagtgagctgagatcgtgctactgcactccagcctgggggacaga 
aagagactccgtctcaa (SEQ ID NO: 54) 



6.22. Bcl-2
[0412] See, e.g., GenBank Accession No. M14745.
General Target Regions: 

[0413] (1) 5' Untranslated Region (GenBank Accession No. gi|19887364|gb|BQ061909.1|BQ061909):
tttctgtgaagcagaagtctgggaatcgatctggaaatcctcctaatttttactccctctccccccgactcctgattcattgggaagtttca 
aatcagctataactggagagagctgaagattgatgggatcgttgccttatgcctttgttttggttttacaaaaaggaaacttgacagagg 
atcatgctatacttaaaaaatacaacatcgcagaggaagtagactcatattaaaaatacttactaataataacgtgcctcatgaagtaaa 
gatccgaaaggaattggaataaaactttcctgcatctcaagccaagggggaaacaccagaatcaagtgttccgcgtgattgaagac 
accccctcgtccaagaatgcaaagcacatccaataaaagagctggattataactcctcttctttctctgggggccgtggggtgggag 
ctggggcgagaggtgccgttggcccccgttgcttttcctctgggaggg (SEQ ID NO: 55) 
[0414] (2) 3' Untranslated Region: 

tgaagtcaacatgcctgccccaaacaaatatgcaaaaggttcactaaagcagtagaaataatatgcattgtcagtgatgttccatgaaa 

caaagctgcaggctgtttaagaaaaaataacacacatataaacatcacacacacagacagacacacacacacacaacaattaacag 

tcttcaggcaaaacgtcgaatcagctatttactgccaaagggaaatatcatttattttttacattattaagaaaaaaagatttatttatttaag 

acagtcccatcaaaactcctgtctttggaaatccgaccactaattgccaagcaccgcttcgtgtggctccacctggatgttctgtgcct 

gtaaacatagattcgctttccatgttgttggccggatcaccatctgaagagcagacggatggaaaaaggacctgatcattggggaag 

ctggctttctggctgctggaggctggggagaaggtgttcattcacttgcatttctttgccctgggggctgtgatattaacagagggagg 

gttcctgtggggggaagtccatgcctccctggcctgaagaagagactctttgcatatgactcacatgatgcatacctggtgggagga 

aaagagttgggaacttcagatggacctagtacccactgagatttccacgccgaaggacagcgatgggaaaaatgcccttaaatcat 

aggaaagtatttttttaagctaccaattgtgccgagaaaagcattttagcaatttatacaatatcatccagtaccttaagccctgattgtgt 

atattcatatattttggatacgcaccccccaactcccaatactggctctgtctgagtaagaaacagaatcctctggaacttgaggaagtg 

aacatttcggtgacttccgcatcaggaaggctagagttacccagagcatcaggccgccacaagtgcctgcttttaggagaccgaagt 

ccgcagaacctgcctgtgtcccagcttggaggcctggtcctggaactgagccggggccctcactggcctcctccagggatgatca 

acagggcagtgtggtctccgaatgtctggaagctgatggagctcagaattccactgtcaagaaagagcagtagaggggtgtggctg 

ggcctgtcaccctggggccctccaggtaggcccgttttcacgtggagcatgggagccacgacccttcttaagacatgtatcactgta 

gagggaaggaacagaggccctgggcccttcctatcagaaggacatggtgaaggctgggaacgtgaggagaggcaatggccac 

ggcccattttggctgtagcacatggcacgttggctgtgtggccttggcccacctgtgagtttaaagcaaggctttaaatgactttggag 

agggtcacaaatcctaaaagaagcattgaagtgaggtgtcatggattaattgacccctgtctatggaattacatgtaaaacattatcttg 

tcactgtagtttggttttatttgaaaacctgacaaaaaaaaagttccaggtgtggaatatgggggttatctgtacatcctggggcattaaa 

aaaaaaatcaatggtggggaactataaagaagtaacaaaagaagtgacatcttcagcaaataaactaggaaatttttttttcttccagttt 

agaatcagccttgaaacattgatggaataactctgtggcattattgcattatataccatttatctgtattaactttggaatgtactctgttcaa 

tgtttaatgctgtggttgatatttcgaaagctgctttaaaaaaatacatgcatctcagcgtttttttgtttttaattgtatttagttatggcctata 

cactatttgtgagcaaaggtgatcgttttctgtttgagatttttatctcttgattcttcaaaagcattctgagaaggtgagataagccctgag 

tctcagotacctaagaaaaacctggatgtcactggccactgaggagctttgtttcaaccaagtcatgtgcatttccacgtcaacagaatt 

gtttattgtgacagttatatctgttgtccctttgaccttgtttcttgaaggtttcctcgtccctgggcaattccgcatttaattcatggtattcag 
gattacatgcatgtttggttaaacccatgagattcattcagttaaaaatccagatggcaaatgaccagcagattcaaatctatggtggttt
gacctttagagagttgctttacgtggcctgtttcaacacagacccacccagagccctcctgccctccttccgcgggggctttctcatgg 
ctgtccttcagggtcttcctgaaatgcagtggtgcttacgctccaccaagaaagcaggaaacctgtggtatgaagccagacctcccc 
ggcgggcctcagggaacagaatgatcagacctttgaatgattctaatttttaagcaaaatattattttatgaaaggtttacattgtcaaagt 
gatgaatatggaatatccaatcctgtgctgctatcctgccaaaatcattttaatggagtcagtttgcagtatgctccacgtggtaagatcc 
tccaagctgctttagaagtaacaatgaagaacgtggacgcttttaatataaagcctgttttgtcttctgttgttgttcaaacgggattcaca 
gagtatttgaaaaatgtatatatattaagaggtcacgggggctaattgctggctggctgccttttgctgtggggttttgttacctggtttta 
ataacagtaaatgtgcccagcctcttggccccagaactgtacagtattgtggctgcacttgctctaagagtagttgatgttgcattttcct 
tattgttaaaaacatgttagaagcaatgaatgtatataaaagcctcaactagtcatttttttctcctcttcttttttttcattatatctaattattttg 
cagttgggcaacagagaaccatccctattttgtattgaagagggattcacatctgcatcttaactgctctttatgaatgaaaaaacagtc 
ctctgtatgtactcctctttacactggccagggtcagagttaaatagagtatatgcactttccaaattggggacaagggctctaaaaaaa 
gccccaaaaggagaagaacatctgagaacctcctcggccctcccagtccctcgctgcacaaatactccgcaagagaggccagaat 
gacagctgacagggtctatggccatcgggtcgtctccgaagatttggcaggggcagaaaactctggcaggcttaagatttggaata 
aagtcacagaatcaaggaagcacctcaatttagttcaaacaagacgccaacattctctccacagctcacttacctctctgtgttcagat 
gtggccttccatttatatgtgatctttgttttattagtaaatgcttatcatctaaagatgtagctctggcccagtgggaaaaattaggaagtg 
attataaatcgagaggagttataataatcaagattaaatgtaaataatcagggcaatcccaacacatgtctagctttcacctccaggatc 
tattgagtgaacagaattgcaaatagtctctatttgtaattgaacttatcctaaaacaaatagtttataaatgtgaacttaaactctaattaat 
tccaactgtacttttaaggcagtggctgtttttagactttcttatcacttatagttagtaatgtacacctactctatcagagaaaaacaggaa 
aggctcgaaatacaagccattctaaggaaattagggagtcagttgaaattctattctgatcttattctgtggtgtcttttgcagcccagac 
aaatgtggttacacactttttaagaaatacaattctacattgtcaagcttatgaaggttccaatcagatctttattgttattcaatttggatcttt 
cagggattttttttttaaattattatgggacaaaggacatttgttggaggggtgggagggaggaacaatttttaaatataaaacattccca 
agtttggatcagggagttggaagttttcagaataaccagaactaagggtatgaaggacctgtattggggtcgatgtgatgcctctgcg 
aagaaccttgtgtgacaaatgagaaacattttgaagtttgtggtacgacctttagattccagagacatcagcatggctcaaagtgcagc 
tccgtttggcagtgcaatggtataaatttcaagctggatatgtctaatgggtatttaaacaataaatgtgcagttttaactaacaggatattt 
aatgacaaccttctggttggtagggacatctgtttctaaatgtttattatgtacaatacagaaaaaaattttataaaattaagcaatgtgaa 
actgaattggagagtgataatacaagtcctttagtcttacccagtgaatcattctgttccatgtctttggacaaccatgaccttggacaat 
catgaaatatgcatctcactggatgcaaagaaaatcagatggagcatgaatggtactgtaccggttcatctggactgccccagaaaa 
ataacttcaagcaaacatcctatcaacaacaaggttgttctgcataccaagctgagcacagaagatgggaacactggtggaggatg 
gaaaggctcgctcaatcaagaaaattctgagactattaataaataagactgtagtgtagatactgagtaaatccatgcacctaaaccttt 
tggaaaatctgccgtgggccctccagatagctcatttcattaagtttttccctccaaggtagaatttgcaagagtgacagtggattgcatt 
tcttttggggaagctttcttttggtggttttgtttattataccttcttaagttttcaaccaaggtttgcttttgttttgagttactggggttatttttgt 
tttaaataaaaataagtgtacaataagtgtttttgtattgaaagcttttgttatcaagattttcatacttttaccttccatggctctttttaagattg 
atacttttaagaggtggctgatattctgcaacactgtacacataaaaaatacggtaaggatactttacatggttaaggtaaagtaagtctc 
cagttggccaccattagctataatggcactttgtttgtgttgttggaaaaagtcacattgccattaaactttccttgtctgtctagttaatatt 
gtgaagaaaaataaagtacagtgtgagatactg (SEQ ID NO: 56) 
6.23. Insulin-like Growth Factor Binding Protein-2
[0415] See, e.g., GenBank Accession No. X16302.
General Target Regions: 

[0416] (1) 5' Untranslated Region: 

attcggggcgagggaggaggaagaagcggaggaggcggctcccgctcgcagggccgtgcacctgcccgcccgcccgctcgc 
tcgctcgcccgccgcgccgcgctgccgaccgccagc (SEQ ID NO: 57) 
[0417] (2) 3' Untranslated Region: 

tgatccagggagcccccaccatccggggggaccccgagtgtcatctcttctacaatgagcagcaggaggcttgcggggtgcacac 

ccagcggatgcagtagaccgcagccagccggtgcctggcgcccctgccccccgcccctctccaaacaccggcagaaaacgga 

gagtgcttgggtggtgggtgctggaggattttccagttctgacacacgtatttatatttggaaagagaccagcaccgagctcggcacc 

tccccggcctctctcttcccagctgcagatgccacacctgctccttcttgctttccccgggggaggaagggggttgtggtcggggag 

ctggggtacaggtttggggagggggaagagaaatttttatttttgaacccctgtgtcccttttgcatjiagattaaaggaaggaaaagt 

(SEQ ID NO: 58) 

6.24. K-ras Oncogene Protein 

[0418] See, e.g., GenBank Accession No. M54968.

General Target Regions: 

[0419] (1) 5' Untranslated Region: 

tcctaggcggcggccgcggcggcggaggcagcagcggcggcggcagtggcggcggcgaaggtggcggcggctcggccagt 
actcccggcccccgccatttcggactgggagcgagcgcggcgcaggcactgaaggcggcggcggggccagaggctcagcgg 
ctcccaggtgcgggagagaggcctgctgaaa (SEQ ID NO: 59)
[0420] (2) 3' Untranslated Region: 

taaatacaatttgtacttttttcttaaggcatactagtacaagtggtaatttttgtacattacactaaattattagcatttgttttagcattaccta 

atttttttcctgctccatgcagactg^tagcttttaccttaaatgcttattttaaaatgacagtggaagtttttttttcctcgaagtgccagtattc 

ccagagttttggtttttgaactagcaatgcctgtgaaaaagaaactgaatacctaagatttctgtcttggggtttttggtgcatgcagttga 

ttacttcttatttttcttaccaagtgtgaatgttggtgtgaaacaaattaatgaagcttttgaatcatccctattctgtgttttatctagtcacata 

aatggattaattactaatttcagttgagaccttctaattggtttttactgaaacattgagggacacaaatttatgggcttcctgatgatgattc 

ttctaggcatcatgtcctatagtttgtcatccctgatgaatgtaaagttacactgttcacaaaggttttgtctcctttccactgctattagtcat 

ggtcactctccccaaaatattatattttttctataaaaagaaaaaaatggaaaaaaattacaaggcaatggaaactattataaggccattt 

ccttttcacattagataaattactataaagactcctaatagctttttcctgttaaggcagacccagtatgaatgggattattatagcaaccat 

tttggggctatatttacatgctactaaatttttataataattgaaaagattttaacaagtataaaaaaattctcataggaattaaatgtagtctc 

cctgtgtcagactgctctttcatagtataactttaaatcttttcttcaacttgagtctttgaagatagttttaattctgcttgtgacattaaaaga 

ttatttgggccagttatagcttattaggtgttgaagagaccaaggttgcaagccaggccctgtgtgaaccttgagctttcatagagagtt 

tcacagcatggactgtgtgccccacggtcatccgagtggttgtacgatgcattggttagtcaaaaatggggagggactagggcagtt 

tggatagctcaacaagatacaatctcactctgtggtggtcctgctgacaaatcaagagcattgcttttgtttcttaagaaaacaaactcttt 

tttaaaaattacttttaaatattaactcaaaagttgagattttggggtggtggtgtgccaagacattaattttttttttaaacaatgaagtgaa 
aaagttttacaatctctaggtttggctagttctcttaacactggttaaattaacattgcataaacacttttcaagtctgatccatatttaataat
gctttaaaataaaaataaaaacaatccttttgataaatttaaaatgttacttattttaaaataaatgaagtgagatggcatggtgaggtgaa 
agtatcactggactaggttgttggtgacttaggttctagataggtgtcttttaggactctgattttgaggacatcacttactatccatttcttc 
atgttaaaagaagtcatctcaaactcttagttttttttttttacactatgtgatttatattccatttacataaggatacacttatttgtcaagctca 
gcacaatctgtaaatttttaacctatgttacaccatcttcagtgccagtcttgggcaaaattgtgcaagaggtgaagtttatatttgaatatc 
cattctcgttttaggactcttcttccatattagtgtcatcttgcctccctaccttccacatgccccatgacttgatgcagttttaatacttgtaa 
ttcccctaaccataagatttactgctgctgtggatatctccatgaagttttcccactgagtcacatcagaaatgccctacatcttattttcct 
cagggctcaagagaatctgacagataccataaagggatttgacctaatcactaattttcaggtggtggctgatgctttgaacatctcttt 
gctgcccaatccattagcgacagtaggatttttcaaccctggtatgaatagacagaaccctatccagtggaaggagaatttaataaag 
atagtgcagaaagaattccttaggtaatctataactaggactactcctggtaacagtaatacattccattgttttagtaaccagaaatcttc 
atgcaatgaaaaatactttaattcatgaagcttacttttttttttttggtgtcagagtctcgctcttgtcacccaggctggaatgcagtggcg 
ccatctcagctcactgcaaccttccatcttcccaggttcaagcgattctcgtgcctcggcctcctgagtagctgggattacaggcgtgt 
gcactacactcaactaatttttgtatttttaggagagacggggtttcacctgttggccaggctggtctcgaactcctgacctcaagtgatt 
cacccaccttggcctcataaacctgttttgcagaactcatttattcagcaaatatttattgagtgcctaccagatgccagtcaccgcaca 
aggcactgggtatatggtatccccaaacaagagacataatcccggtccttaggtactgctagtgtggtctgtaatatcttactaaggcc 
tttggtatacgacccagagataacacgatgcgtattttagttttgcaaagaaggggtttggtctctgtgccagctctataattgttttgcta 
cgattccactgaaactcttcgatcaagctactttatgtaaatcacttcattgttttaaaggaataaacttgattatattgtttttttatttggcata 
actgtgattcttttaggacaattactgtacacattaaggtgtatgtcagatattcatattgacccaaatgtgtaatattccagttttctctgcat 
aagtaattaaaatatacttaaaaattaatagttttatctgggtacaaataaacagtgcctgaactagttcacagacaagggaaacttctat 
gtaaaaatcactatgatttctgaattgctatgtgaaactacagatctttggaacactgtttaggtagggtgttaagacttgacacagtacct 
cgtttctacacagagaaagaaatggccatacttcaggaactgcagtgcttatgaggggatatttaggcctcttgaatttttgatgtagat 
gggcatttttttaaggtagtggttaattacctttatgtgaactttgaatggtttaacaaaagatttgtttttgtagagattttaaagggggaga 
attctagaaataaatgttacctaattattacagccttaaagacaaaaatccttgttgaagtttttttaaaaaaagactaaattacatagactta 
ggcattaacatgtttgtggaagaatatagcagacgtatattgtatcatttgagtgaatgttcccaagtaggcattctaggctctatttaact 
gagtcacactgcataggaatttagaacctaacttttataggttatcaaaactgttgtcaccattgcacaattttgtcctaatatatacataga 
aactttgtggggcatgttaagttacagtttgcacaagttcatctcatttgtattccattgatttttttttttcttctaaacattttttcttcaaaaca 
gtatatataactttttttaggggattttttttagacagcaaaaaactatctgaagatttccatttgtcaaaaagtaatgatttcttgataattgtg 
tagtgaatgttttttagaacccagcagttaccttgaaagctgaatttatatttagtaacttctgtgttaatactggatagcatgaattctgcat 
tgagaaactgaatagctgtcataaaatgctttctttcctaaagaaagatactcacatgagttcttgaagaatagtcataactagattaaga 
tctgtgttttagtttaatagtttgaagtgcctgtttgggataatgataggtaatttagatgaatttaggggaaaaaaaagttatctgcagttat 
gttgagggcccatctctccccccacacccccacagagctaactgggttacagtgtttta.tccgaaagtttccaattcc (SEQ ID 
NO: 60) 
6.25. Target of Antiproliferative Antibody

[0421] See, e.g., GenBank Accession No. M33680 and Trifillis et al., 1999, RNA 5:1071-1082.

General Target Regions: 

[0422] (1) 5' Untranslated Region: 

ccattgtgct ggaaaggcgc gcaacggcgg cgacggcggc gaccccaccg cgcatcctgc caggcctccg 
cgcccagccg cccacgcgcc cccgcgcccc gcgccccgac cctttcttcg cgcccccgcc cctcggcccg 
ccaggccccc ttgccggcca cccgccaggc cccgcgccgg cccgcccgcc gcccaggacc ggcccgcgcc 
ccgcaggccg cccgccgccc gcgccgcc (SEQ ID NO: 61) 
[0423] (2) 3' Untranslated Region: 

g gccccgcagc tctggccaca gggacctctg cagtgccccc taagtgaccc ggacacttcc gagggggcca 
tcaccgcctg tgtatataac gtttccggta ttactctgct acacgtagcc tttttacttt tggggttttg tttttgttct gaactttcct 
gttacctttt cagggctgat gtcacatgta ggtggcgtgt atgagtggag acgggcctgg gtcttgggga ctggagggca 
ggggtccttc tgcccctggg gtcccagggt gctctgcctg ctcagccagg cctctcctgg gagccactcg cccagagact 
cagcttggcc aacttggggg gctgtgtcca cccagcccgc ccgtcctgtg ggctgcacag ctcaccttgt tccctcctgc 
cccggttcga gagccgagtc tgtgggcact ctctgccttc atgcacctgt cctttctaac acgtcgcctt caactgtaat 
cacaacatcc tgactccgtc atttaataaa gaaggaacat caggcatgct aaaaaaaaaa aaaaaa (SEQ ID NO: 62) 

6.26. Downstream Regulatory Element-antagonist Modulator 
[0424] See, e.g., GenBank Accession No. AJ131730. 

General Target Regions: 

[0425] (1) 5' Untranslated Region: 

gaattccggc aaacatgagg cagctgccag ccggcctggg cagtcttgtc tgcctcggct gtgaagtggg gaggctggca 
acagttttct tcagcgccca gg (SEQ ID NO: 63) 
[0426] (2) 3' Untranslated Region: 

gacacgt ccaaaggagt gcatggccac agccacctcc acccccaaga aacctccatc ctgccaggag cagcctccaa 

gaaactttta aaaaatagat ttgcaaaaag tgaacagatt gctacacaca cacacacaca cacacacaca cacacacaca 

gccattcatc tgggctggca gaggggacag agttcaggga ggggctgagt ctggctaggg gccgagtcca 

gaggccccag ccagcccttc ccaggccagc gaggcgaggc tgcctctggg tgagtggctg acagagcagg 

tctgcaggcc accagctgct ggatgtcacc aagaaggggc tcgagtgccc tgcaggaggg tccaatcctc cggtcccacc 

tcgtcccgtt catccattct gctttcttgc cacacagtgg ccggcccagg ctcccctggt ctcctccccg tagccactct 

ctgcccacta cctatgcttc tagaaagccc ctcacctcag gaccccagag gaccagctgg ggggcagggg 

ggagaggggg taatggaggc caagcctgca gctttctgga aattcttccc tgggggtccc agtatcccct gctactccac 

tgacctggaa gagctgggta ccaggccacc cactgtgggg caagcctgag tggtgagggg ccactggcat cattctccct 

ccatggcagg aaggcggggg atttcaagtt tagggattgg gtcgtggtgg agaatctgag ggcactctgc cagctccaca 

ggtggatgag cctctccttg ccccagtcct ggttcagtgg gaatgcagtg ggtggggctg tacacaccct ccagcacaga 
ctgttccctc caaggtcctc ttaggtcccg gggaggaacg tggttcagag actggcagcc agggagcccg gggcagagct
cagaggagtc tgggaagggg cgtgtccctc ctcttcctgt agtgcccctc ccatggccca gcagcttggc tgagcccctc 
tcctgaagca gctgtgcgcc gtccctctgc cttgcacaaa aagcacaaga cattccttag cagctcagcg cagccctagt 
gggagcccag cacactgctt ctcggaggcc aggccctcct gctggctgag cttgggcccg gtggccccaa tatggtggcc 
ctggggaaga ggccttgggg gtctgctctg tgcctgggat cagtggggcc ccaaagccca gcccggctga ccaacattca 
aaagcacaaa ccctggggac tctgcttggc tgtcccctcc atctggggat ggagaatgca gcccaaagct ggagccaatg 
gtgagggctg agagggctgt ggctgggtgg tcagcagaaa ccccaggagg agagagatgc tgctcccgcc 
tgattggggc ctcacccaga aggaacccgg tcccagccgc atggcccctc caggaacatt cccacataat acattccatc 
acagccagcc cagctccact cagggctggc ccggggagtc cccgtgtgcc ccaagaggct agccccaggg 
tgagcagggc cctcagagga aaggcagtat ggcggaggcc atgggggccc ctcggcattc acacacagcc 
tggcctcccc tgcggagctg catggacgcc tggctccagg ctccaggctg actggggcct ctgcctccag gagggcatca 
gctttccctg gctcagggat cttctccctc ccctcacccg ctgcccagcc ctcccagctg atgtcactct gcctctaagc 
caaggcctca ggagagcatc accaccacac cctgcggcct tgccttgggg ccagactggc tgcacagccc 
aaccaggagg ggtctgcctc ccacgctggg acacagaccg gccgcatgtc tgcatggcag aagcgtctcc cttgccacgg 
cctgggaggg tggttcctgt tctcagcatc cactaatatt cagtcctgta tattttaata aaataaactt gacaaaggaa 
aaaaaaaccg (SEQ ID NO: 64) 

6.27. Cox2
[0427] See, e.g., GenBank Accession No. M90100. 
General Target Regions: 
[0428] (1) 5' Untranslated Region: 

gtccaggaac tcctcagcag cgcctccttc agctccacag ccagacgccc tcagacagca aagcctaccc ccgcgccgcg 
ccctgcccgc cgctgcg (SEQ ID NO: 65) 
[0429] (2) 3' Untranslated Region: 

aagtctaa tgatcatatt tatttattta tatgaaccat gtctattaat ttaattattt aataatattt atattaaact ccttatgtta 

cttaacatct tctgtaacag aagtcagtac tcctgttgcg gagaaaggag tcatacttgt gaagactttt atgtcactac 

tctaaagatt ttgctgttgc tgttaagttt ggaaaacagt ttttattctg ttttataaac cagagagaaa tgagttttga cgtcttttta 

cttgaatttc aacttatatt ataaggacga aagtaaagat gtttgaatac ttaaacacta tcacaagatg ccaaaatgct 

gaaagttttt acactgtcga tgtttccaat gcatcttcca tgatgcatta gaagtaacta atgtttgaaa ttttaaagta 

cttttgggta tttttctgtc atcaaacaaa acaggtatca gtgcattatt aaatgaatat ttaaattaga cattaccagt aatttcatgt 

ctacttttta aaatcagcaa tgaaacaata atttgaaatt tctaaattca tagggtagaa tcacctgtaa aagcttgttt 

gatttcttaa agttattaaa cttgtacata taccaaaaag aagctgtctt ggatttaaat ctgtaaaatc agatgaaatt 

ttactacaat tgcttgttaa aatattttat aagtgatgtt cctttttcac caagagtata aaccttttta gtgtgactgt taaaacttcc 

ttttaaatca aaatgccaaa tttattaagg tggtggagcc actgcagtgt tatctcaaaa taagaatatc ctgttgagat 

attccagaat ctgtttatat ggctggtaac atgtaaaaac cccataaccc cgccaaaagg ggtcctaccc ttgaacataa 

agcaataacc aaaggagaaa agcccaaatt attggttcca aatttagggt ttaaactttt tgaagcaaac ttttttttag 
ccttgtgcac tgcagacctg gtactcagat tttgctatga ggttaatgaa gtaccaagct gtgcttgaat aacgatatgt
tttctcagat tttctgttgt acagtttaat ttagcagtcc atatcacatt gcaaaagtag caatgacctc ataaaatacc 
tcttcaaaat gcttaaattc atttcacaca ttaattttat ctcagtcttg aagccaattc agtaggtgca ttggaatcaa 
gcctggctac ctgcatgctg ttccttttct tttcttcttt tagccatttt gctaagagac acagtcttct caaacacttc gtttctccta 
ttttgtttta ctagttttaa gatcagagtt cactttcttt ggactctgcc tatattttct tacctgaact tttgcaagtt ttcaggtaaa 
cctcagctca ggactgctat ttagctcctc ttaagaagat taaaaaaaaa aaaaaa (SEQ ID NO: 66) 

6.28. Her-2 
[0430] General Target Regions: 
[0431] (1) 5' Untranslated Region: 

gcgcccggcccccacccctcgcagcaccccgcgccccgcgccctcccagccgggtccagccggagccatggggccggagcc 
gcagtgagcaccatggag (SEQ ID NO: 67) 
[0432] (2) 3' Untranslated Region: 

tgaaccagaaggccaagtccgcagaagccctgatgtgtcctcagggagcagggaaggcctgacttctgctggcatcaagaggtg 

ggagggccctccgaccacttccaggggaacctgccatgccaggaacctgtcctaaggaaccttccttcctgcttgagttcccagatg 

gctggaaggggtccagcctcgttggaagaggaacagcactggggagtctttgtggattctgaggccctgcccaatgagactctagg 

gtccagtggatgccacagcccagcttggccctttccttccagatcctgggtactgaaagccttagggaagctggcctgagagggga 

agcggccctaagggagtgtctaagaacaaaagcgacccattcagagactgtccctgaaacctagtactgccccccatgaggaagg 

aacagcaatggtgtcagtatccaggctttgtacagagtgcttttctgtttagtttttactttttttgttttgtttttttaaagacgaaataaagac 

ccaggggagaatgggtgttgtatggggaggcaagtgtggggggtccttctccacacccactttgtccatttgcaaatatattttggaaa 

ac (SEQ ID NO: 68) 

7. EXAMPLE: VASCULAR ENDOTHELIAL GROWTH FACTOR 

7.1. Introduction 
[0433] Vascular endothelial growth factor (VEGF) plays a key role in tumor
angiogenesis. Considerable evidence demonstrates that VEGF is a viable target for tumor
therapy (Carmeliet & Jain, 2000, Nature 407:249-257; Sepp-Lorenzino & Pan, 2000,
Angiogenesis - Research frontiers. A basic science conference of the New York Academy
of Medicine. Exp. Opin. Invest. Drugs 9:1-7; and Hicklin et al., 2001, DDT 6:517-528).
Several ongoing clinical trials (phase I to phase III) indicate that either VEGF
neutralizing antibodies or inhibitors of VEGFR2-mediated signal transduction are effective for
tumor therapy (Carmeliet & Jain, 2000, Nature 407:249-257 and Matter, 2001, Drug
Discovery Today 6:1005-1024).

[0434] VEGF protein expression is tightly regulated at both the transcriptional and
post-transcriptional levels. Under hypoxic conditions, tumor cells express high levels of VEGF
that can promote angiogenesis and thus support the growth of tumor cells. The increase in
VEGF protein is due to both increased transcription and enhanced mRNA stability.
Hypoxia-inducible factor 1 (HIF-1) is responsible for the transcriptional activation of the
VEGF gene in hypoxic cells by binding to a hypoxia response element (HRE) located 1 kb
upstream of the transcription initiation site. In addition, the abundance of VEGF mRNA is
increased due to stabilization of the mRNA by binding of HuR to the 3' UTR (untranslated
region). Under hypoxic conditions, cap-dependent translation is replaced by
cap-independent translation of the VEGF mRNA, which is mediated by an internal ribosome
entry site (IRES) within the VEGF 5'UTR.

[0435] This Example demonstrates the generation of stable cell lines, harboring VEGF
5' and 3'UTR sequences, which can be used to identify small molecular weight compounds
that inhibit VEGF IRES-dependent translation or modulate VEGF mRNA stability.

7.2. Materials and Methods 

7.2.1. Generation of VEGF 5' and 3'UTRs 

[0436] The VEGF 5'UTR was generated using PCR from human genomic DNA. The
full-length 5'UTR was prepared by the ligation of two separate PCR products (FIG. 1A).
The first half of the 5'UTR (designated VEGF 5'UTR2, encompassing nucleotides 1 to
498) was amplified with primer 1 (5'-AAA GTG GAG GTA ATG GCG GAG GCT TGG
GGC AGG GGG-3', SEQ ID NO: 69) and primer 2 (5'-TTT GGG AGT GGT GAG GTG
CGG GAT GGG AAG-3', SEQ ID NO: 70). The second half of the VEGF 5'UTR
(designated VEGF 5'UTR1, from nucleotide 337 to 1038, plus the first 45 bp of the VEGF
open reading frame) was amplified using primer 3 (5'-AA GTC GAG GTA AGA GCT
CGA GAG AGA AGT GGA G-3', SEQ ID NO: 71) and primer 4 (5'-AAA CGG GGG GAG
CAA GGG AAG GCT CGA ATG GAG-3', SEQ ID NO: 72). Each PCR product was
digested with BamH I, and the two were ligated together to produce the full-length 5'UTR. To facilitate
downstream cloning into dicistronic plasmid p21uc-i, primers 1 and 3 were designed to
include a Sal I site at the 5' end, followed immediately by a stop codon (TAA), and
primer 4 for VEGF 5'UTR1 includes an Xma I site at the 5' end (FIG. 1C).
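
The ligation step described above can be illustrated in silico. The following is a minimal Python sketch under stated assumptions, not the procedure used in this Example: it joins two overlapping fragments at a shared BamH I recognition sequence (GGATCC). The function name and the toy fragments are hypothetical placeholders, and the model ignores the exact cut position within the site and the resulting overhangs.

```python
BAMHI_SITE = "GGATCC"  # BamH I recognition sequence


def join_at_shared_site(upstream, downstream, site=BAMHI_SITE):
    """Join two overlapping fragments at a site that occurs exactly once in each.

    Keeps everything before the site from the upstream fragment and
    everything from the site onward from the downstream fragment.
    """
    upstream, downstream = upstream.upper(), downstream.upper()
    for name, seq in (("upstream", upstream), ("downstream", downstream)):
        if seq.count(site) != 1:
            raise ValueError(f"{name} fragment must contain exactly one {site} site")
    return upstream[:upstream.index(site)] + downstream[downstream.index(site):]


# Toy fragments (hypothetical; these are not the VEGF sequences)
first_half = "AAAACCCGGATCCTTT"
second_half = "CCGGATCCTTTGGGG"
full_length = join_at_shared_site(first_half, second_half)
```

Requiring exactly one site per fragment mirrors the assumption that the enzyme cuts each PCR product only once; a fragment with additional sites would yield more than two pieces and would need a different joining strategy.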
[0437] The entire VEGF 3 ' UTR (shown in FIG. IB) was amplified by genomic PGR 
using primer 5 (5'-GCC GGG CAG GAG GAA GGA GCC TCC GTC AGG GTT TCG 
GGA 3', SEQ ID NO: 73) and primer 6 (5'-GTG CAC TAG AGA CAA AGA CG T GAT 
GTT AAT -3 ', SEQ ID NO: 74. The Bgl 11 and EcoR I restiiction sites were used for 
subsequent cloning. 
7.2.2. Plasmid Construction 

[0438] Each PCR fragment (VEGF 5'UTR1, VEGF 5'UTR2 and VEGF 3'UTR) was 
cloned into the pT-Adv vector using the Clontech Advantage cloning kit and confirmed by 
DNA sequencing. A Sal I-Xma I VEGF 5'UTR1 fragment was subcloned into the 
p2luc-i dicistronic plasmid (FIG. 2A, Grentzmann et al., 1998, RNA 4:479-486). The 
sequence of the polylinker site is GAA CAA ATG TCG ACG GGG GCC CCT AGC AGA 
TCT AGC GCT GGA TCC CCC GGG GAG CTC AUG GAA GAC (SEQ ID NO: 75, FIG. 
2A). The resulting plasmid (designated p2luc/VEGF5UTR1, see FIG. 2B) contains VEGF 
5'UTR1 between the two reporter genes (renilla luciferase and firefly luciferase) with a 
stop codon (TAA) immediately after the Sal I site and a translational fusion junction between 
the first 15 amino acids of VEGF and the firefly luciferase open reading frame. To construct 
the dicistronic plasmid containing the full-length VEGF 5'UTR, VEGF 5'UTR2 was then 
subcloned into p2luc/VEGF5UTR1 between Sal I and BamH I (designated 
p2luc/VEGF5UTR-fl; FIG. 2B). This plasmid also has a stop codon (TAA) immediately 
after the Sal I site to prevent read-through from the first reporter to the second. 
[0439] To map the region of the IRES essential for activity, dicistronic plasmids 
containing various deletions within the VEGF 5'UTR were prepared (FIG. 2B). Plasmid 
p2luc/vegf5'utr-delta51-476 was derived from p2luc/vegf5'utr-fl by removing the Nhe I 
fragment (nt 51 to 746); plasmid p2luc/vegf5utr-delta476-1038 was derived from 
p2luc/vegf5utr-fl by removing the sequence from the BamH I site to the 3' end of the 5'UTR; 
plasmid p2luc/vegf5utr-delta1-476 was derived from p2luc/vegf5utr-fl by removing the 
sequence from the BamH I site to the 5' end of the 5'UTR. 

[0440] To generate stable cell lines for high throughput screening, a monocistronic 
reporter plasmid (pluc/VEGF5'+3'UTR) containing the VEGF 5' and 3'UTRs and the firefly 
luciferase gene (FIG. 3A) was constructed. Briefly, a Sal I-Not I fragment, containing the 
full-length VEGF 5'UTR and firefly reporter gene, from p2luc/VEGF5'UTR-fl was 
subcloned into pcDNA5/TO between EcoR V and Not I, and then the VEGF 3'UTR was 
subcloned into the intermediate plasmid at the Not I site by blunt-end ligation. 

7.2.3. DNA Transfection and Generation of Stable Cell Lines 

[0441] 293T cells were transfected with pluc/VEGF5'+3'UTR using the Fugene 6 
transfection reagent (Roche) according to the manufacturer's instructions. 48 hours after 
transfection, the cells were lysed and plasmid function was monitored by measuring 
luciferase activity using Promega's luciferase kit according to the manufacturer's instructions. 

[0442] To generate stable cell lines, plasmid pluc/VEGF5'+3'UTR was transfected into 
293T cells as described. 48 hours after transfection, the cells were trypsinized, resuspended 
in culture media plus 200 µg/ml hygromycin B, then seeded in 96 well plates at 100 to 500 
cells per well for selection. The media containing hygromycin B was changed every 3 to 4 
days. After 10 to 14 days of selection, hygromycin-resistant clones were screened under a 
microscope and wells harboring a single colony were expanded under hygromycin selection 
for further experiments. 

7.2.4. Luciferase Assay 

[0443] Firefly luciferase and renilla luciferase activities were measured using the 
luciferase reporter assay system (Promega) according to the manufacturer's instructions. 

7.2.5. Semi-quantitative PCR 

[0444] DNA and RNA were isolated from B9 cells using TRIzol reagent (GIBCO BRL) 
according to the manufacturer's instructions. cDNA was synthesized using Promega's 
reverse transcription system. Semi-quantitative PCR was performed with gene-specific 
primers for firefly luciferase or glyceraldehyde-3-phosphate dehydrogenase (GAPDH) as an 
internal control. The primer pair for firefly luciferase amplification was as follows: 
5'-CGG TGT TGG GCG CGT TAT TTA TCG GAG TTG-3' (SEQ ID NO: 76) and 5'-TTG 
GCG AAG AAT GAA AAT AGG GTT GGT ACT-3' (SEQ ID NO: 77); the primer pair 
for GAPDH was as follows: 5'-GGT GAA GGT CGG AGT CAA CGG A-3' (SEQ ID 
NO: 78) and 5'-GAG GGA TCT CGC TCC TGG AAG A-3' (SEQ ID NO: 79). The PCR 
products were separated on a 1% agarose gel, stained with ethidium bromide and quantified 
on a UVP system with Labworks software. 

7.2.6. High Throughput Screening 

[0445] High throughput screening ("HTS") for compounds that inhibit untranslated 
region-dependent expression of vascular endothelial growth factor ("VEGF") is 
accomplished using the stable cell lines described in Section 7.2.3. The 293T cell line 
contains stably integrated copies of the firefly luciferase gene flanked by both the 5' and 
3' UTRs of VEGF. Cell lines exhibiting consistently high levels of firefly luciferase 
expression are further expanded and optimized for HTS. 

[0446] Screening of compounds is done using one hundred 384-well plates per day. 
Each 384-well plate contains a standard puromycin titration curve that is used as a reference 
to calculate % inhibition and the statistical significance of the data points generated in the 
assay. This curve occupies the wells of columns 3 and 4 of the 384-well plate. The 
concentration of puromycin is 20 µM serially diluted 2-fold to 0.078 µM, plated in 
quadruplicate. Columns 1 and 2 contain 16 wells each of a positive control (0.5% 
DMSO) and a negative control (puromycin at 20 µM). The difference 
between the two controls is used as the window to calculate the percentage of inhibition of 
luciferase expression in the presence of a compound. Columns 5 through 24 contain 
compounds from a library of small molecules. 
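
The control window described above can be illustrated with a short Python sketch. It is only a sketch under stated assumptions: the function names and the example luciferase readings are hypothetical, the puromycin concentrations are taken to be micromolar, and the calculation actually performed in the screening database may differ.

```python
from statistics import mean


def puromycin_series(top_um=20.0, lowest_um=0.078):
    """Two-fold serial dilution from top_um down to lowest_um (micromolar)."""
    series = []
    conc = top_um
    while conc >= lowest_um:
        series.append(conc)
        conc /= 2.0
    return series


def percent_inhibition(signal, dmso_wells, puromycin_wells):
    """Percent inhibition of a well relative to the control window.

    dmso_wells      -- readings from the 0.5% DMSO control (0% inhibition)
    puromycin_wells -- readings from the 20 uM puromycin control (100% inhibition)
    """
    high = mean(dmso_wells)
    low = mean(puromycin_wells)
    window = high - low
    if window <= 0:
        raise ValueError("control window must be positive")
    return 100.0 * (high - signal) / window


# Hypothetical readings, for illustration only (not data from the screen).
dmso = [1000.0, 980.0, 1020.0, 1000.0]
puro = [50.0, 48.0, 52.0, 50.0]
series = puromycin_series()
half_inhibited = percent_inhibition(525.0, dmso, puro)
```

A well reading halfway between the two control means scores 50% inhibition; a reading equal to the DMSO mean scores 0%.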

[0447] Two confluent T175 flasks of the VEGF stable cell line described above (B9) 
are split into twenty T175 flasks three days prior to screening. On each day of the HTS 
assay, the cells are dislodged from the flask with 3 ml of 0.25% trypsin-EDTA (Gibco, cat 
no. 25200-056) and diluted to 10 ml with non-selective media. This is repeated for all 
twenty flasks and the cells are combined, counted and diluted to a concentration of 263.15 
cells/µl. 38 to 39 µl are then added to each well containing 1 to 2 µl of compound from a 
small molecule library to a final compound concentration of 7.5 µM (3.75 µg/ml) in 0.5% 
DMSO. The puromycin standard curve also contains 0.5% DMSO. The stable cell line is 
incubated in the presence of compound overnight (approximately 16 hours) at 37°C in 5% 
CO2. To monitor firefly luciferase activity, LucLite Plus (Packard cat no. 6016969) is 
prepared according to the manufacturer's instructions and 20 µl is added to each well. 
Following a brief incubation at room temperature (minimum 2 min.), firefly luciferase 
activity in each well is detected with the ViewLux 1430 ultraHTS Microplate Imager 
(Perkin Elmer). All data obtained is uploaded into Activity Base for % inhibition 
calculations and statistical analyses. 
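
The plating arithmetic in the paragraph above can be checked with a few lines of Python. This is a sketch that assumes the volumes are in microliters and the cell density is in cells per microliter (the reading under which the numbers are self-consistent for a 384-well plate); the variable names are illustrative, and the stock concentrations are derived values, not figures stated in the text.

```python
CELLS_PER_UL = 263.15      # density of the diluted cell suspension
CELL_VOLUME_UL = 38.0      # cell suspension added per well
COMPOUND_VOLUME_UL = 2.0   # compound already present in the well
FINAL_COMPOUND_UM = 7.5    # final compound concentration (micromolar)
FINAL_DMSO_PCT = 0.5       # final DMSO concentration (percent)

cells_per_well = CELLS_PER_UL * CELL_VOLUME_UL
total_volume_ul = CELL_VOLUME_UL + COMPOUND_VOLUME_UL
dilution_factor = total_volume_ul / COMPOUND_VOLUME_UL

# Concentrations the 2 ul compound aliquot would need before cells are added
# (derived here, not stated in the text).
stock_compound_um = FINAL_COMPOUND_UM * dilution_factor
stock_dmso_pct = FINAL_DMSO_PCT * dilution_factor

# Average molecular weight implied by 7.5 uM == 3.75 ug/ml:
# (ug/ml) / (umol/l) * 1000 gives g/mol.
implied_mw = 3.75 / 7.5 * 1000.0
```

Under these assumptions each well receives roughly 10,000 cells, the compound is diluted 20-fold on addition of cells, and the stated mass concentration corresponds to an average molecular weight of 500 g/mol.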

7.3. Results 

[0448] The ability of VEGF 5'UTR sequences to modulate internal translation initiation 
was tested using the plasmid vector that encodes a dicistronic mRNA (FIG. 2A). The 
renilla luciferase is translated from the first cistron by a cap-dependent scanning 
mechanism, while the firefly luciferase in the second cistron is translated only if preceded 
by an internal ribosome entry site. In this study, five dicistronic plasmids containing 
various deletions of the VEGF 5'UTR (FIG. 2B) were generated and transiently transfected 
into 293T cells to monitor IRES-dependent translation of firefly luciferase. 48 hours after 
transfection, extracts were prepared and assayed for renilla and firefly luciferase activities 
using the dual luciferase kit from Promega. As shown in FIG. 2C, deletion of either the 
first 336 or the first 476 nucleotides has no significant effect on firefly luciferase activity 
compared to full-length VEGF 5'UTR-directed luciferase levels. However, deletion 
between nucleotides 51 and 746 decreased firefly luciferase activity by more than 75% 
(33.68+/-4.91 vs. 161+/-30.49). Deletion of nucleotide 476 to the 3' end of the VEGF 
5'UTR decreased firefly luciferase activity by more than 90% (12.15+/-1.2 vs. 161+/-30.49). 
Taken together, these results confirm that the VEGF 5'UTR harbors IRES activity, and also 
indicate that the region of the VEGF IRES essential for function is located between 
nucleotide 476 and the 3' end of the VEGF 5'UTR. 
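
The percentage decreases quoted above follow directly from the reported mean activities, as the following Python sketch shows. The function name is illustrative; only the mean values given in the text are used, and the reported standard deviations are ignored.

```python
FULL_LENGTH = 161.0  # mean firefly luciferase activity, full-length 5'UTR


def percent_decrease(value, reference=FULL_LENGTH):
    """Decrease of `value` relative to `reference`, in percent."""
    return 100.0 * (reference - value) / reference


deletion_51_746 = percent_decrease(33.68)    # Nhe I deletion, nt 51 to 746
deletion_476_end = percent_decrease(12.15)   # deletion of nt 476 to the 3' end
```

The two results are about 79% and about 92%, consistent with "more than 75%" and "more than 90%".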

[0449] To generate stable cell lines for high throughput screening ("HTS"), a 
monocistronic reporter plasmid under the transcriptional control of the CMV promoter 
(pluc/vegf5'+3'UTR; FIG. 3A) was constructed. This plasmid contains both the VEGF 5' 
and 3' UTRs separated by the firefly luciferase gene. After confirmation of luciferase 
production by transient transfection (data not shown), transfected 293T cells were seeded 
in 96 well plates at a concentration of 100-500 cells per well, and then cultured under 
hygromycin B selection. After two weeks of selection, 19 clones were screened for 
luciferase activity, three of which demonstrated high levels of luciferase activity (clones 
B9, D3, H6; FIG. 3B). To determine which cell line demonstrated the highest level of 
expression, the luciferase activities of clones B9, D3 and H6 were compared after 
normalization to the protein concentration of the extract from each cell line. The results 
shown in FIG. 4 demonstrate that the luciferase activity from B9 cells was two-fold greater 
than that from H6 cells, and more than three-fold higher than that from D3 cells. 

[0450] To determine if the B9 cells are stable, these cells were maintained under 
hygromycin selection for more than three months, with intermittent monitoring of luciferase 
activity. The results indicate that this cell line is stable and sustains a high level of 
luciferase expression when continuously cultured in vitro for more than three months (FIG. 
5). Sustained expression of luciferase by B9 cells indicated that the monocistronic plasmid 
had integrated into the genomic DNA. Semi-quantitative PCR was performed to determine the 
number of copies of the reporter plasmid integrated per B9 cell. As FIG. 6A shows, serially 
diluted plasmid pluc5'+3'vegf-UTR was included as a positive control to make sure the 
reaction for the sample (genomic DNA from B9 cells) was within the linear range, i.e., not 
saturated. The PCR standard curve was plotted with the PCR product intensity against the 
amount of positive plasmid control loaded for PCR (FIG. 6B). SigmaPlot regression 
indicated that the PCR product intensity for B9 genomic DNA (50 ng) is about the same as 
that of 6.4 pg of plasmid control. As 1 µg of 8 kb plasmid contains roughly 10^11 copies and 
10^6 cells have 10 µg of genomic DNA, the results indicated that approximately 100 copies of 
the plasmid were integrated per cell. 
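
The copy-number estimate can be reproduced with a short Python sketch. It rests on assumptions that are not stated in the text: an average mass of about 650 g/mol per base pair, Avogadro's number, and the reading that 1 µg of 8 kb plasmid corresponds to roughly 10^11 copies and that 10^6 cells yield 10 µg of genomic DNA. The function and variable names are illustrative.

```python
AVOGADRO = 6.022e23
BP_MASS_G_PER_MOL = 650.0  # assumed average mass of one DNA base pair


def plasmid_copies(mass_g, size_bp):
    """Number of double-stranded plasmid molecules in a given mass."""
    return mass_g / (size_bp * BP_MASS_G_PER_MOL) * AVOGADRO


copies_per_ug = plasmid_copies(1e-6, 8000)       # on the order of 10^11
plasmid_equiv = plasmid_copies(6.4e-12, 8000)    # copies in 6.4 pg of plasmid

genomic_dna_per_cell_g = 10e-6 / 1e6             # 10 ug per 10^6 cells
cells_in_sample = 50e-9 / genomic_dna_per_cell_g  # cells represented by 50 ng
copies_per_cell = plasmid_equiv / cells_in_sample
```

With the rounded figure of 10^11 copies per µg, 6.4 pg corresponds to 6.4 x 10^5 copies spread over about 5,000 cell equivalents, or roughly 130 copies per cell; the unrounded calculation gives a slightly higher value of the same order, in line with the estimate of approximately 100 copies per cell.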

[0451] High throughput screening ("HTS") for compounds that inhibit untranslated 
region-dependent expression of vascular endothelial growth factor ("VEGF") is 
accomplished with the generated stable cell lines. 
8. EXAMPLE: SURVIVIN 

8.1. Introduction 

[0452] Survivin, a member of the IAP (inhibitor of apoptosis proteins) gene family, is 
critically required for suppression of apoptosis and for ensuring cell division through the G2/M 
phase of the cell cycle. It is absent in normal adult tissues, but highly expressed in all of the 
most common cancers in a cell cycle-regulated manner. Disruption of survivin 
expression/function by antisense or dominant-negative mutation resulted in deregulation of 
mitotic progression and spontaneous apoptosis. It has been demonstrated that survivin 
targeting in vivo increased apoptosis and reduced proliferation of tumor cells, but did not 
affect the viability of proliferating normal cells. It has also been shown that survivin 
targeting induces apoptosis and sensitizes tumor cells to chemical agents. Therefore, 
inhibition of survivin may be of great benefit for refractory cancer therapy when combined 
with standard chemotherapy. Another expected benefit is low toxicity, because 
survivin expression is absent in normal adult tissues. Taken together, these findings indicate 
that survivin is a valid target for cancer therapy. 

[0453] Translation of survivin might be cap-independent/IRES dependent since it is 
maximally expressed in metaphase. In G2/M phase, the eIF4E-binding proteins (4E-BPs) 
become hypophosphorylated. 4E-BPs compete with eIF4G for eIF4E binding, thus 
preventing eIF4F formation and cap-dependent translation initiation. In addition, survivin 
mRNA has a long 3' UTR featuring a poly(U) sequence and multiple CU repeats. 
[0454] This Example demonstrates the generation of stable cell lines, harboring 
survivin 5' and 3' UTR sequences, to identify small molecular weight compounds that 
inhibit survivin IRES-dependent translation or modulate survivin mRNA stability. 

8.2. Materials and Methods 

8.2.1. Generation of Survivin 5' and 3' UTRs 
[0455] The 5' UTR of survivin was generated by filling in partially overlapping 
oligonucleotides with Taq polymerase. A 5' UTR forward oligonucleotide (5' 
AAAGTCGACGTAACCGCCAGATTTGAATCGCGGGACCCGTTGGCAGAGGTGGCGG 3', 
SEQ ID NO: 80) encompassing nucleotides 1 to 42 of the 5' UTR of survivin and a 
5' UTR reverse oligonucleotide (5' 
AAAGGATCCGGGCAACGTCGGGGCACCCATGCCGCCGCCGCCACCTCTGCCAAC 3', 
SEQ ID NO: 81) encompassing nucleotides 26 to 49 of the 5' UTR of survivin as well 
as the first 21 nucleotides of the open reading frame of survivin were annealed at 45°C and 
extended at 72°C with Taq polymerase. The Sal I and BamH I restriction sites (underlined) 
were used for subsequent cloning. 
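
The fill-in strategy relies on the 3' ends of the two oligonucleotides being complementary. The following Python sketch, with illustrative function names, takes the two sequences exactly as listed above, finds the overlap between the forward oligonucleotide and the reverse complement of the reverse oligonucleotide, and assembles the expected double-stranded product as its top strand.

```python
FORWARD = "AAAGTCGACGTAACCGCCAGATTTGAATCGCGGGACCCGTTGGCAGAGGTGGCGG"
REVERSE = "AAAGGATCCGGGCAACGTCGGGGCACCCATGCCGCCGCCGCCACCTCTGCCAAC"

COMPLEMENT = str.maketrans("ACGT", "TGCA")


def reverse_complement(seq):
    """Reverse complement of a DNA sequence written 5' to 3'."""
    return seq.translate(COMPLEMENT)[::-1]


def longest_overlap(left, right):
    """Length of the longest suffix of `left` that is a prefix of `right`."""
    for size in range(min(len(left), len(right)), 0, -1):
        if left.endswith(right[:size]):
            return size
    return 0


# Express the reverse oligonucleotide in top-strand orientation.
reverse_as_top = reverse_complement(REVERSE)
overlap = longest_overlap(FORWARD, reverse_as_top)

# Top strand of the product after both 3' ends have been extended.
product = FORWARD + reverse_as_top[overlap:]
```

The overlap found is 17 nucleotides, which matches positions 26 to 42 of the 5' UTR shared by the two oligonucleotides, and the filled-in product is 92 nucleotides long with the Sal I site (GTCGAC) near one end and the BamH I site (GGATCC) near the other.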

[0456] The 3' UTR of survivin was amplified from human genomic DNA using the 3' 
UTR forward oligonucleotide (5' 
AAAGCGGCCGCGGCCTCTGCCGGAGCTGCCTGGTCCCAGA 3', SEQ ID NO: 82) 
and the 3' UTR reverse oligonucleotide (5' 
AAATCTAGACTCAGGAACAGCCGAGATGACCTCCAGA 3', SEQ ID NO: 83). The 
Not I and Xba I restriction sites were used for subsequent cloning. 

8.2.2. Plasmid Construction 
[0457] The survivin 3' UTR PCR product generated in Section 8.2.1 was cloned into 
pT-Adv for sequence verification. A positive clone was subsequently digested with Not 
I and Xba I and the resulting 1.1 kb survivin 3' UTR PCR fragment was subcloned into 
pcDNA3.1/Hygro (Invitrogen cat. no. V87020) to generate the intermediate plasmid 
Surv3'UTR/pcDNA3.1/Hygro. 

[0458] The survivin 5' UTR DNA fragment generated in Section 8.2.1 was digested 
with Sal I and BamH I and was subcloned into p2luci (see, e.g., Grentzmann et al., 1998, 
RNA 4:479-486) to generate the intermediate plasmid, Surv5'UTR/p2luci, which contains 
the 5' UTR of survivin between the open reading frames of the renilla and firefly luciferase 
reporter genes. Surv5'UTR/p2luci was then digested with either Sal I and Not I or BamH I 
and Not I to isolate and gel purify the 1.75 kb survivin 5' UTR-firefly luciferase or the 1.7 
kb firefly luciferase DNA fragments. The Sal I 5' overhang of the 1.75 kb survivin 5' 
UTR-firefly luciferase fragment was filled in with T4 DNA polymerase and the fragment was 
subcloned into both Surv3'UTR/pcDNA3.1/Hygro and pcDNA3.1/Hygro digested with EcoR V and 
Not I to generate the plasmids Surv5'UTR-Fluc-Surv3'UTR/pcDNA3.1/Hygro and 
Surv5'UTR-Fluc/pcDNA3.1/Hygro. The former plasmid contains the firefly luciferase 
reporter gene flanked by both the 5' and 3' untranslated regions of survivin, while the 
latter plasmid contains the firefly luciferase reporter gene preceded only by the 5' UTR of 
survivin and will be used as a "5' UTR-only" control plasmid in future experiments. The 
1.7 kb BamH I-Not I firefly luciferase fragment was subcloned into both 
Surv3'UTR/pcDNA3.1/Hygro and pcDNA3.1/Hygro to generate the plasmids 
Fluc-Surv3'UTR/pcDNA3.1/Hygro and Fluc/pcDNA3.1/Hygro. The former plasmid 
contains the firefly luciferase reporter gene followed only by the 3' UTR of survivin and 
will be used as a "3' UTR-only" control plasmid in future experiments. The latter plasmid 
contains only the firefly luciferase reporter gene lacking any surrounding untranslated 
regions of survivin and will be used in subsequent studies as a "no UTR" control plasmid. 
[0459] Since the 5' UTR of survivin is small (49 nucleotides), it is likely that 
cap-dependent and cap-independent firefly luciferase expression in the survivin expression 
plasmids described above cannot be distinguished. One method of separating 
cap-dependent from cap-independent translation is through the introduction of a stable 
hairpin or secondary structure upstream of or near the 5' end of the 5' UTR of the expression 
vector (see, e.g., Muhlrad et al., 1995 Mol. Cell. Biol. 15:2145-2156). Therefore, two 
complementary oligonucleotides, SL top (5' 
CTAGAAGCTTAGGGCCGCGGATCCGCGCGCGGTTCGCCGCGCGCGGATCCGCGGTAGCAAGTTAGTC 3', 
SEQ ID NO: 84) and SL bottom (5' 
GACTAAGCTTGCTACCGCGGATCCGCGCGCGGCGAACCGCGCGCGGATCCGCGGCCCTAAGCTTCTAG 3', 
SEQ ID NO: 85) were synthesized, digested with Hind III, 
annealed and subcloned into the survivin expression vectors described above. A stable 
stem-loop structure with an 18 base-pair stem and a UUCG loop sequence will form and 
effectively block cap-dependent translation (see, e.g., Beelman & Parker, 1994 J. Biol. 
Chem. 269:9687-9692 and Muhlrad et al., 1995 Mol. Cell. Biol. 15:2145-2156). 
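
The stem length stated above can be checked against the SL top sequence as listed. The Python sketch below is a simple illustration, not a structure-prediction method: it locates the loop (TTCG in the DNA oligonucleotide, UUCG in the transcript) and counts consecutive Watson-Crick pairs moving outward from it. G-U wobble pairs are deliberately not counted, and the function name is hypothetical.

```python
SL_TOP = (
    "CTAGAAGCTTAGGGCCGCGGATCCGCGCGCGGTTCGCCGCGCGCGGATCCGCG"
    "GTAGCAAGTTAGTC"
)

# Watson-Crick pairs only (DNA alphabet, T standing in for U).
PAIRS = {("A", "T"), ("T", "A"), ("G", "C"), ("C", "G")}


def stem_length(seq, loop="TTCG"):
    """Count contiguous base pairs flanking the first occurrence of `loop`."""
    start = seq.find(loop)
    if start < 0:
        raise ValueError("loop sequence not found")
    left = start - 1
    right = start + len(loop)
    length = 0
    while left >= 0 and right < len(seq) and (seq[left], seq[right]) in PAIRS:
        length += 1
        left -= 1
        right += 1
    return length


stem = stem_length(SL_TOP)
```

For the sequence above the function returns 18, in agreement with the 18 base-pair stem described in the text.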

8.2.3. DNA Transfection and Stable Cell Line Generation 
[0460] 293T cells were transiently transfected with equal amounts of each of the 
survivin expression vectors described in Section 8.2.2 (both with and without each of the 5' 
and 3' UTRs of survivin and in the presence and absence of the stem-loop secondary 
structure) using the Fugene 6 transfection reagent (Roche) according to the manufacturer's 
instructions. Untranslated region-dependent firefly luciferase activity was monitored 
forty-eight hours post-transient transfection according to the manufacturer's instructions (see, 
e.g., Section 8.2.4). 

[0461] To generate stable cell lines, 293T cells were transiently transfected as above. 
Instead of lysing the transiently transfected 293T cells to monitor firefly luciferase activity, 
the cells were trypsinized, counted and seeded (10 ml) in 10 cm petri dishes at a 
concentration of 5000 cells/ml. The following day, hygromycin B was added in culture 
media to a final concentration of 200 µg/ml to select for cells in which the transiently 
transfected plasmid had stably integrated into the genome. Following ten to fourteen days 
of hygromycin B selection, individual hygromycin-resistant clones were expanded by 
transferring the cells from the petri dish to a single well in a six or twenty-four well plate 
using trypsin-soaked filter discs according to the manufacturer's instructions. Individual cell 
lines are then selected for further studies based on firefly luciferase expression levels. 
8.2.4. Luciferase Assays 

[0462] Firefly luciferase activity was measured with the luciferase reporter assay 
system (Promega) according to the manufacturer's instructions. 

8.2.5. High Throughput Screening 

[0463] High throughput screening ("HTS") for compounds that inhibit untranslated 
region-dependent expression of survivin is accomplished with the stable cell lines described in 
Section 8.2.3. The 293T cell line contains stably integrated copies of the firefly luciferase 
gene flanked by both the 5' and 3' UTRs of survivin. Cell lines exhibiting consistently high 
levels of firefly luciferase expression are further expanded and optimized for HTS. 
[0464] Screening of compounds is done using one hundred 384-well plates per day. 
Each 384-well plate contains a standard puromycin titration curve that is used as a reference 
to calculate % inhibition and the statistical significance of the data points generated in the 
assay. This curve occupies the wells of columns 3 and 4 of the 384-well plate. The 
concentration of puromycin is 20 µM serially diluted 2-fold to 0.078 µM, plated in 
quadruplicate. Columns 1 and 2 contain 16 wells each of a positive control (0.5% 
DMSO) and a negative control (puromycin at 20 µM). The difference 
between the two controls is used as the window to calculate the percentage of inhibition of 
luciferase expression in the presence of a compound. Columns 5 through 24 contain 
compounds from a library of small molecules. 

[0465] Two confluent T175 flasks of the survivin stable cell line described above are 
split into twenty T175 flasks three days prior to screening. On each day of the HTS assay, 
the cells are dislodged from the flask with 3 ml of 0.25% trypsin-EDTA (Gibco, cat no. 
25200-056) and diluted to 10 ml with non-selective media. This is repeated for all twenty 
flasks and the cells are combined, counted and diluted to a concentration of 263.15 cells/µl. 
38 µl are then added to each well containing 2 µl of compound from a small molecule 
library to a final compound concentration of 7.5 µM (3.75 µg/ml) in 0.5% DMSO. The 
puromycin standard curve also contains 0.5% DMSO. The stable cell line is incubated in 
the presence of compound overnight (approximately 16 hours) at 37°C in 5% CO2. To 
monitor firefly luciferase activity, LucLite Plus (Packard cat no. 6016969) is prepared 
according to the manufacturer's instructions and 20 µl is added to each well. Following a brief 
incubation at room temperature (minimum 2 min.), firefly luciferase activity in each well is 
detected with the ViewLux 1430 ultraHTS Microplate Imager (Perkin Elmer). All data 
obtained is uploaded into Activity Base for % inhibition calculations and statistical 
analyses. 
8.3. Results 

[0466] To determine the effect of the survivin untranslated regions on 
post-transcriptional control of gene expression, transient transfections of the survivin 
expression vectors described in Section 8.2.2, containing both, one or none of the 5' and 3' 
UTRs of survivin, in the absence or presence of the stem-loop secondary structure, were 
performed. In the absence of the stem-loop secondary structure, cap-dependent and 
cap-independent translation are equally favored and no significant difference in firefly 
luciferase expression could be detected when either or both of the 5' and 3' UTRs are 
present or absent (FIG. 7A). This result confirms the earlier notion that, in the survivin 
expression vectors without the stem-loop secondary structure, the 5' UTR of survivin is 
unable to block cap-dependent translation. In the presence of the stem-loop secondary 
structure, a 3-fold increase in firefly luciferase expression can be detected only in the 
survivin expression vectors that contain the 5' UTR of survivin (FIG. 7B). This result 
strongly suggests that the 5' UTR of survivin can function as an internal ribosome entry site 
and promote cap-independent translation, and helps explain the increase in the endogenous 
levels of survivin in the G2/M phase of the cell cycle, when overall translation is 
dramatically reduced. 

[0467] High throughput screening ("HTS") for compounds that inhibit untranslated 
region-dependent expression of survivin is accomplished with the generated stable cell 
lines. 

9. EXAMPLE: HER-2 

[0468] This Example demonstrates the generation of stable cell lines, harboring Her-2 
5' and 3' UTR sequences, to identify small molecular weight compounds that inhibit Her-2 
5' UTR-dependent translation or modulate Her-2 mRNA stability. 

9.1. Her-2 Constructs 

9.1.1. Generation of Her-2 in vitro Expression Constructs 
[0469] The 99 nucleotide 5' UTR of Her-2 was PCR-amplified from human genomic 
DNA (Promega) using the following primers: Sense/HindIII: 
CAAGAAGCTTgcgcccggccccccacccctcg (SEQ ID NO: 86) and Antisense/NcoI: 
AGCCCATGGtgctcactgcggctccggcccc (SEQ ID NO: 87). The Advantage-GC2-PCR kit 
(Clontech) was used according to the manufacturer's instructions with the following 
cycle conditions: 94°C for 3 minutes, followed by 35 cycles of 94°C for 30 
seconds and 68°C for 30 seconds. The PCR-amplified product was cloned using the pT Adv 
kit (Clontech) according to the manufacturer's instructions. All clones were confirmed by 
sequencing. The resulting clone was digested with HindIII/NcoI and the fragment was 
cloned into pT7Luc, upstream of the luciferase gene, to generate pT7Luc/5'UTR. 
[0470] The 615 nucleotide 3'UTR was PCR-amplified from human genomic DNA 
(Promega) using the following primers: sense/BglII: agactctgaaccagaaggccaa (SEQ ID NO: 
88) and antisense/KpnI: ctcggtaccagttttccaaaatatatttgcaaatgg (SEQ ID NO: 89). The 
Titanium Taq kit (Clontech) was used according to the manufacturer's instructions with the 
following amplification conditions: 94°C for 1 minute, followed by 35 cycles of 94°C for 30 
seconds to denature, 60°C for 30 seconds to anneal and 72°C for 1 minute to extend. The 
product was gel purified and cloned using pT Adv (Clontech) according to the manufacturer's 
instructions. All clones were sequenced. The resulting clone was digested with BglII/KpnI 
and cloned into BglII/KpnI-digested pT7Luc and pT7Luc/5'UTR to generate 
pT7Luc/3'UTR and pT7Luc/5'and3'UTR, respectively. 

9.1.2. Generation of Her-2 in vivo Expression Constructs 
[0471] Constructs for cell-based expression were generated by isolating the Her-2 
containing fragments of pT7Luc/5'UTR, pT7Luc/3'UTR and pT7Luc/5'and3'UTR 
after digestion with HindIII and KpnI and cloning them into pcDNA (+) (Invitrogen). 

9.1.3. Generation of Her-2 uORF Mutants 

[0472] The uORF contained within the Her-2 5' UTR was removed by extending 
overlapping long primers. The overlapping sequence is underlined. The sense minus uORF 
HindIII primer is: cccaagcttcgcgcccggccccccacccctcgcagcaccccgcgccccgcgccctccc (SEQ 
ID NO: 90) and the antisense minus uORF NcoI primer is: 
ggccccatggctccggctggacccggctgggacccggctgggagggcgcgggagggcgcgg (SEQ ID NO: 91). 
The primers (10 micrograms) were denatured at 95°C for 2 minutes, annealed at 60°C for 5 
minutes and extended at 72°C for 10 minutes using Taq polymerase (Clontech). After 
buffer exchange, the product was digested with NcoI and HindIII and cloned into the 
HindIII/NcoI sites of the in vitro expression vectors pT7Luc and pT7Luc/3'UTR, yielding 
pT7Luc/5'UTR minus uORF and pT7Luc/5'UTR minus uORF and 3'UTR. Both plasmids 
were digested with HindIII and KpnI and the Her-2 containing fragment was subcloned into 
the HindIII/KpnI site of pcDNA (+) (Invitrogen) for cell-based studies. 

9.2. Stable Cell Line Production 
[0473] Stable cell lines were generated in HeLa, 293T, and MCF-7 cells. First, transient 
transfection was carried out using the Fugene 6 transfection reagent (Roche) according to the 
manufacturer's instructions. Untranslated region-dependent firefly luciferase activity was 
monitored forty-eight hours post-transient transfection with the luciferase reporter assay 
system (Promega) according to the manufacturer's instructions. 

[0474] Next, stable cell lines were generated by first transiently transfecting the above 
cell lines. Instead of lysing the transiently transfected 293T cells (or other cells), the cells were 
trypsinized, counted and seeded (10 ml) in 10 cm petri dishes at a concentration of 5000 
cells/ml. The following day, hygromycin B was added in culture media to a final 
concentration of 100 µg/ml for 293T cells and 200 µg/ml for MCF-7 and HeLa cells, to select 
for cells in which the transiently transfected plasmid had stably integrated into the genome. 
Following ten to fourteen days of hygromycin B selection, individual hygromycin-resistant 
clones were expanded by transferring the cells from the petri dish to a single well in a six or 
twenty-four well plate using trypsin-soaked filter discs according to the manufacturer's 
instructions. Individual cell lines are then selected for further studies based on firefly 
luciferase expression levels. 

9.3. In vitro High Throughput Screen 
[0475] Construct pT7Luc/5'and3'UTR is utilized as a template for large-scale T7 
polymerase transcription according to the manufacturer's protocol (Ambion). The mRNA 
template containing the Her-2 5' and 3' UTRs and the luciferase ORF is uncapped and used 
at 100 nanograms/reaction for a typical in vitro HTS. The number of samples run 
determines the transcription yield that must be obtained. For example, for 
100,000 reactions, using 100 nanograms of RNA/reaction, 10 milligrams of RNA must be 
produced. Typical yields from the Ambion T7 Transcription Kit for this template are 5 
mg/ml of transcription. 
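
The scale-up arithmetic above can be written out as a short Python sketch. The function names are illustrative, and the per-day figure assumes that every well of one hundred 384-well plates receives a reaction, which is an assumption rather than a statement from the text.

```python
def rna_required_mg(reactions, ng_per_reaction=100.0):
    """Total uncapped mRNA needed, converting nanograms to milligrams."""
    return reactions * ng_per_reaction / 1e6


def transcription_volume_ml(rna_mg, yield_mg_per_ml=5.0):
    """Transcription reaction volume needed at the stated typical yield."""
    return rna_mg / yield_mg_per_ml


rna_mg = rna_required_mg(100_000)            # 10.0 mg, as stated in the text
volume_ml = transcription_volume_ml(rna_mg)  # 2.0 ml at 5 mg/ml

# One screening day, assuming all wells of 100 plates are used.
wells_per_day = 100 * 384
rna_per_day_mg = rna_required_mg(wells_per_day)
```

At the typical yield, the 10 mg needed for 100,000 reactions corresponds to about 2 ml of transcription reaction, and a full screening day under the stated assumption consumes a little under 4 mg of RNA.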

[0476] Screening of compounds is done using one hundred 384-well plates per day. 
Each 384-well plate contains a standard puromycin titration curve that is used as a reference 
to calculate % inhibition and the statistical significance of the data points generated in the 
assay. This curve occupies the wells of columns 3 and 4 of the 384-well plate. The 
concentration of puromycin is 20 µM serially diluted 2-fold to 0.078 µM, plated in 
quadruplicate. Columns 1 and 2 contain 16 wells each of a positive control (4% 
DMSO) and a negative control (puromycin at 20 µM). The difference 
between the two controls is used as the window to calculate the percentage of inhibition of 
luciferase expression in the presence of a compound. Columns 5 through 24 contain 
compounds from a library of small molecules. 

[0477] The in vitro translation reaction of the Her-2 driven luciferase ORF consists of 
four microliters of rabbit reticulocyte lysate (Green Hectares) supplemented with 0.013 
mg/ml hemin (Sigma), 0.05 mg/ml creatine kinase (Roche), and 0.125 mg/ml tRNA 
(Sigma Type Ail, rabbit liver), 100 nanograms of uncapped mRNA and buffer containing 100 
mM KOAc, 0.5 mM Mg(OAc)2, 10 mM creatine phosphate and 0.03 mM amino acid mix, in a 
reaction volume of 20 µl. The reaction is then incubated at 30°C for 45 minutes. At the 
end of incubation, 20 µl of LucLite (Packard) is added to the reaction and the light output 
resulting from luciferase-catalyzed conversion of luciferin is monitored on a ViewLux 
uHTS plate reader (Perkin Elmer). 

10. EXAMPLE: CELL EXPRESSION VECTORS 

[0478] Stable cell line expression vectors (pCMR1 and pCMR2) are shown in FIGS. 8A 
and 8C. pCMR1 is a high-level stable and transient mammalian expression vector designed 
to randomly integrate into the genome, and pCMR2 is an episomal mammalian expression 
vector. pMCP1 (FIG. 8B) is a high-level stable and transient mammalian expression vector 
designed to site-specifically integrate into the genome of cells genetically engineered to 
contain the FRT site-specific recombination site via the Flp recombinase (see, e.g., Craig, 
1988, Ann. Rev. Genet. 22: 77-105; and Sauer, 1994, Curr. Opin. Biotechnol. 5: 521-527). 
The nucleotide sequences are presented below. 
[0479] pCMR1 (SEQ ID NO: 92) 

gacggatcgggagatctcccgatcccctatggtgcactctcagtacaatctgctctgatgccgcatagttaagccagtatctgctccct 

gcttgtgtgttggaggtcgctgagtagtgcgcgagcaaaatttaagctacaacaaggcaaggcttgaccgacaattgcatgaagaat 

ctgcttagggttaggcgttttgcgctgcttcgcgatgtacgggccagatatacgcgttgacattgattattgactagttattaatagtaatc 

aattacggggtcattagttcatagcccatatatggagttccgcgttacataacttacggtaaatggcccgcctggctgaccgcccaac 

gacccccgcccattgacgtcaataatgacgtatgttcccatagtaacgccaatagggactttccattgacgtcaatgggtggagtattt 

acggtaaactgcccacttggcagtacatcaagtgtatcatatgccaagtacgccccctattgacgtcaatgacggtaaatggcccgc 

ctggcattatgcccagtacatgaccttatgggactttcctacttggcagtacatctacgtattagtcatcgctattaccatggtgatgcgg 

ttttggcagtacatcaatgggcgtggatagcggtttgactcacggggatttccaagtctccaccccattgacgtcaatgggagtttgttt 

tggcaccaaaatcaacgggactttccaaaatgtcgtaacaactccgccccattgacgcaaatgggcggtaggcgtgtacggtggga 

ggtctatataagcagagctctctggctaactaagctttcggcgcgccgaggtaccatgggatccgaagacgccaaaaacataaaga 

aaggcccggcgccattctatcctctagaggatggaaccgctggagagcaactgcataaggctatgaagagatacgccctggttcct 

ggaacaattgcttttacagatgcacatatcgaggtgaacatcacgtacgcggaatacttcgaaatgtccgttcggttggcagaagctat 

gaaacgatatgggctgaatacaaatcacagaatcgtcgtatgcagtgaaaactctcttcaattctttatgccggtgttgggcgcgttattt 

atcggagttgcagttgcgcccgcgaacgacatttataatgaacgtgaattgctcaacagtatgaacatttcgcagcctaccgtagtgtt 

tgtttccaaaaaggggttgcaaaaaattttgaacgtgcaaaaaaaattaccaataatccagaaaattattatcatggattctaaaacgga 

ttaccagggatttcagtcgatgtacacgttcgtcacatctcatctacctcccggttttaatgaatacgattttgtaccagagtcctttgatcg 

tgacaaaacaattgcactgataatgaattcctctggatctactgggttacctaagggtgtggcccttccgcatagaactgcctgcgtca 

gattctcgcatgccagagatcctatttttggcaatcaaatcattccggatactgcgattttaagtgttgttccattccatcacggttttggaa 

tgtttactacactcggatatttgatatgtggatttcgagtcgtcttaatgtatagatttgaagaagagctgtttttacgatcccttcaggatta 
caaaattcaaagtgcgttgctagtaccaaccctattttcattcttcgccaaaagcactctgattgacaaatacgatttatctaatttacacg 

aaattgcttctgggggcgcacctctttcgaaagaagtcggggaagcggttgcaaaacgcttccatcttccagggatacgacaaggat 

atgggctcactgagactacatcagctattctgattacacccgagggggatgataaaccgggcgcggtcggtaaagttgttccatttttt 

gaagcgaaggttgtggatctggataccgggaaaacgctgggcgttaatcagagaggcgaattatgtgtcagaggacctatgattat 

gtccggttatgtaaacaatccggaagcgaccaacgccttgattgacaaggatggatggctacattctggagacatagcttactggga 

cgaagacgaacacttcttcatagttgaccgcttgaagtctttaattaaatacaaaggatatcaggtggcccccgctgaattggaatcga 

tattgttacaacaccccaacatcttcgacgcgggcgtggcaggtcttcccgacgatgacgccggtgaacttcccgccgccgttgttgt 

tttggagcacggaaagacgatgacggaaaaagagatcgtggattacgtcgccagtcaagtaacaaccgcgaaaaagttgcgcgg 

aggagttgtgtttgtggacgaagtaccgaaaggtcttaccggaaaactcgacgcaagaaaaatcagagagatcctcataaaggcca 

agaagggcggaaagtccaaattgcgcggccgctaactcgagaataaaatgaggaaattgcatcgcattgtctgagtaggtgtcattc 

tattctggggggtggggtggggcaggacagcaagggggaggattgggaagacaatagcaggcatgctggggatgcggtgggct 

ctatggcttctgaggcggaaagaaccagctggggctctagggggtatccccacgcgccctgtagcggcgcattaagcgcggcgg 

gtgtggtggttacgcgcagcgtgaccgctacacttgccagcgccctagcgcccgctcctttcgctttcttcccttcctttctcgccacgt 

tcgccggctttccccgtcaagctctaaatcgggggctccctttagggttccgatttagtgctttacggcacctcgaccccaaaaaactt 

gattagggtgatggttcacgtagtgggccatcgccctgatagacggtttttcgccctttgacgttggagtccacgttctttaatagtgga 

ctcttgttccaaactggaacaacactcaaccctatctcggtctattcttttgatttataagggattttgccgatttcggcctattggttaaaaa 

atgagctgatttaacaaaaatttaacgcgaattaattctgtggaatgtgtgtcagttagggtgtggaaagtccccaggctccccagcag 

gcagaagtatgcaaagcatgcatctcaattagtcagcaaccaggtgtggaaagtccccaggctccccagcaggcagaagtatgca 

aagcatgcatctcaattagtcagcaaccatagtcccgcccctaactccgcccatcccgcccctaactccgcccagttccgcccattct 

ccgccccatggctgactaattttttttatttatgcagaggccgaggccgcctctgcctctgagctattccagaagtagtgaggaggcttt 

tttggaggcctaggcttttgcaaaaagctcccgggagcttgtatatccattttcggatctgatcagcacgtgatgaaaaagcctgaact 

caccgcgacgtctgtcgagaagtttctgatcgaaaagttcgacagcgtctccgacctgatgcagctctcggagggcgaagaatctc 

gtgctttcagcttcgatgtaggagggcgtggatatgtcctgcgggtaaatagctgcgccgatggtttctacaaagatcgttatgtttatc 

ggcactttgcatcggccgcgctcccgattccggaagtgcttgacattggggaattcagcgagagcctgacctattgcatctcccgcc 

gtgcacagggtgtcacgttgcaagacctgcctgaaaccgaactgcccgctgttctgcagccggtcgcggaggccatggatgcgat 

cgctgcggccgatcttagccagacgagcgggttcggcccattcggaccgcaaggaatcggtcaatacactacatggcgtgatttca 

tatgcgcgattgctgatccccatgtgtatcactggcaaactgtgatggacgacaccgtcagtgcgtccgtcgcgcaggctctcgatg 

agctgatgctttgggccgaggactgccccgaagtccggcacctcgtgcacgcggatttcggctccaacaatgtcctgacggacaat 

ggccgcataacagcggtcattgactggagcgaggcgatgttcggggattcccaatacgaggtcgccaacatcttcttctggaggcc 

gtggttggcttgtatggagcagcagacgcgctacttcgagcggaggcatccggagcttgcaggatcgccgcggctccgggcgtat 

atgctccgcattggtcttgaccaactctatcagagcttggttgacggcaatttcgatgatgcagcttgggcgcagggtcgatgcgacg 

caatcgtccgatccggagccgggactgtcgggcgtacacaaatcgcccgcagaagcgcggccgtctggaccgatggctgtgtag 

aagtactcgccgatagtggaaaccgacgccccagcactcgtccgagggcaaaggaatagcacgtgctacgagatttcgattccac 

cgccgccttctatgaaaggttgggcttcggaatcgttttccgggacgccggctggatgatcctccagcgcggggatctcatgctgga 

gttcttcgcccaccccaacttgtttattgcagcttataatggttacaaataaagcaatagcatcacaaatttcacaaataaagcatttttttc 
actgcattctagttgtggtttgtccaaactcatcaatgtatcttatcatgtctgtataccgtcgacctctagctagagcttggcgtaatcatg 

gtcatagctgtttcctgtgtgaaattgttatccgctcacaattccacacaacatacgagccggaagcataaagtgtaaagcctggggtg 

cctaatgagtgagctaactcacattaattgcgttgcgctcactgcccgctttccagtcgggaaacctgtcgtgccagctgcattaatga 

atcggccaacgcgcggggagaggcggtttgcgtattgggcgctcttccgcttcctcgctcactgactcgctgcgctcggtcgttcgg 

ctgcggcgagcggtatcagctcactcaaaggcggtaatacggttatccacagaatcaggggataacgcaggaaagaacatgtgag 

caaaaggccagcaaaaggccaggaaccgtaaaaaggccgcgttgctggcgtttttccataggctccgcccccctgacgagcatca 

caaaaatcgacgctcaagtcagaggtggcgaaacccgacaggactataaagataccaggcgtttccccctggaagctccctcgtg 

cgctctcctgttccgaccctgccgcttaccggatacctgtccgcctttctcccttcgggaagcgtggcgctttctcatagctcacgctgt 

aggtatctcagttcggtgtaggtcgttcgctccaagctgggctgtgtgcacgaaccccccgttcagcccgaccgctgcgccttatcc 

ggtaactatcgtcttgagtccaacccggtaagacacgacttatcgccactggcagcagccactggtaacaggattagcagagcgag 

gtatgtaggcggtgctacagagttcttgaagtggtggcctaactacggctacactagaagaacagtatttggtatctgcgctctgctga 

agccagttaccttcggaaaaagagttggtagctcttgatccggcaaacaaaccaccgctggtagcggtttttttgtttgcaagcagca 

gattacgcgcagaaaaaaaggatctcaagaagatcctttgatcttttctacggggtctgacgctcagtggaacgaaaactcacgttaa 

gggattttggtcatgagattatcaaaaaggatcttcacctagatccttttaaattaaaaatgaagttttaaatcaatctaaagtatatatgag 

taaacttggtctgacagttaccaatgcttaatcagtgaggcacctatctcagcgatctgtctatttcgttcatccatagttgcctgactccc 

cgtcgtgtagataactacgatacgggagggcttaccatctggccccagtgctgcaatgataccgcgagacccacgctcaccggctc 

cagatttatcagcaataaaccagccagccggaagggccgagcgcagaagtggtcctgcaactttatccgcctccatccagtctatta 

attgttgccgggaagctagagtaagtagttcgccagttaatagtttgcgcaacgttgttgccattgctacaggcatcgtggtgtcacgct 

cgtcgtttggtatggcttcattcagctccggttcccaacgatcaaggcgagttacatgatcccccatgttgtgcaaaaaagcggttagc 

tccttcggtcctccgatcgttgtcagaagtaagttggccgcagtgttatcactcatggttatggcagcactgcataattctcttactgtcat 

gccatccgtaagatgcttttctgtgactggtgagtactcaaccaagtcattctgagaatagtgtatgcggcgaccgagttgctcttgccc 

ggcgtcaatacgggataataccgcgccacatagcagaactttaaaagtgctcatcattggaaaacgttcttcggggcgaaaactctc 

aaggatcttaccgctgttgagatccagttcgatgtaacccactcgtgcacccaactgatcttcagcatcttttactttcaccagcgtttctg 

ggtgagcaaaaacaggaaggcaaaatgccgcaaaaaagggaataagggcgacacggaaatgttgaatactcatactcttccttttt 

caatattattgaagcatttatcagggttattgtctcatgagcggatacatatttgaatgtatttagaaaaataaacaaataggggttccgcg 

cacatttccccgaaaagtgccacctgacgtc 

[0480] pCMR2 (SEQ ID NO: 93) 

gttgacattgattattgactagttattaatagtaatcaattacggggtcattagttcatagcccatatatggagttccgcgttacataactta 
cggtaaatggcccgcctggctgaccgcccaacgacccccgcccattgacgtcaataatgacgtatgttcccatagtaacgccaata 
gggactttccattgacgtcaatgggtggagtatttacggtaaactgcccacttggcagtacatcaagtgtatcatatgccaagtccgcc 
ccctattgacgtcaatgacggtaaatggcccgcctggcattatgcccagtacatgaccttacgggactttcctacttggcagtacatct 
acgtattagtcatcgctattaccatggtgatgcggttttggcagtacaccaatgggcgtggatagcggtttgactcacggggatttcca 
agtctccaccccattgacgtcaatgggagtttgttttggcaccaaaatcaacgggactttccaaaatgtcgtaataaccccgccccgtt 
gacgcaaatgggcggtaggcgtgtacggtgggaggtctatataagcagagctcgtttagtgaaccgtaagctttcggcgcgccacg 
gtaccatgggatccgaagacgccaaaaacataaagaaaggcccggcgccattctatcctctagaggatggaaccgctggagagc 
aactgcataaggctatgaagagatacgccctggttcctggaacaattgcttttacagatgcacatatcgaggtgaacatcacgtacgc 

ggaatacttcgaaatgtccgttcggttggcagaagctatgaaacgatatgggctgaatacaaatcacagaatcgtcgtatgcagtgaa 

aactctcttcaattctttatgccggtgttgggcgcgttatttatcggagttgcagttgcgcccgcgaacgacatttataatgaacgtgaatt 

gctcaacagtatgaacatttcgcagcctaccgtagtgtttgtttccaaaaaggggttgcaaaaaattttgaacgtgcaaaaaaaattacc 

aataatccagaaaattattatcatggattctaaaacggattaccagggatttcagtcgatgtacacgttcgtcacatctcatctacctccc 

ggttttaatgaatacgattttgtaccagagtcctttgatcgtgacaaaacaattgcactgataatgaattcctctggatctactgggttacc 

taagggtgtggcccttccgcatagaactgcctgcgtcagattctcgcatgccagagatcctatttttggcaatcaaatcattccggatac 

tgcgattttaagtgttgttccattccatcacggttttggaatgtttactacactcggatatttgatatgtggatttcgagtcgtcttaatgtata 

gatttgaagaagagctgtttttacgatcccttcaggattacaaaattcaaagtgcgttgctagtaccaaccctattttcattcttcgccaaa 

agcactctgattgacaaatacgatttatctaatttacacgaaattgcttctgggggcgcacctctttcgaaagaagtcggggaagcggt 

tgcaaaacgcttccatcttccagggatacgacaaggatatgggctcactgagactacatcagctattctgattacacccgagggggat 

gataaaccgggcgcggtcggtaaagttgttccattttttgaagcgaaggttgtggatctggataccgggaaaacgctgggcgttaatc 

agagaggcgaattatgtgtcagaggacctatgattatgtccggttatgtaaacaatccggaagcgaccaacgccttgattgacaagg 

atggatggctacattctggagacatagcttactgggacgaagacgaacacttcttcatagttgaccgcttgaagtctttaattaaataca 

aaggatatcaggtggcccccgctgaattggaatcgatattgttacaacaccccaacatcttcgacgcgggcgtggcaggtcttcccg 

acgatgacgccggtgaacttcccgccgccgttgttgttttggagcacggaaagacgatgacggaaaaagagatcgtggattacgtc 

gccagtcaagtaacaaccgcgaaaaagttgcgcggaggagttgtgtttgtggacgaagtaccgaaaggtcttaccggaaaactcg 

acgcaagaaaaatcagagagatcctcataaaggccaagaagggcggaaagtccaaattgcgcggccgctaactcgagaataaac 

aagttaacaacaacaattgcattcattttatgtttcaggttcagggggaggtgtgggaggttttttaaagcaagtaaaacctctacaaatg 

tggtatggctgattatgatccggctgcctcgcgcgtttcggtgatgacggtgaaaacctctgacacatgcagctcccggagacggtc 

acagcttgtctgtaagcggatgccgggagcagacaagcccgtcaggcgtcagcgggtgttggcgggtgtcggggcgcagccatg 

aggtcgactctagaggatcgatgccccgccccggacgaactaaacctgactacgacatctctgccccttcttcgcggggcagtgca 

tgtaatcccttcagttggttggtacaacttgccaactgggccctgttccacatgtgacacggggggggaccaaacacaaaggggttc 

tctgactgtagttgacatccttataaatggatgtgcacatttgccaacactgagtggctttcatcctggagcagactttgcagtctgtgga 

ctgcaacacaacattgcctttatgtgtaactcttggctgaagctcttacaccaatgctgggggacatgtacctcccaggggcccagga 

agactacgggaggctacaccaacgtcaatcagaggggcctgtgtagctaccgataagcggaccctcaagagggcattagcaata 

gtgtttataaggcccccttgttaaccctaaacgggtagcatatgcttcccgggtagtagtatatactatccagactaaccctaattcaata 

gcatatgttacccaacgggaagcatatgctatcgaattagggttagtaaaagggtcctaaggaacagcgatatctcccaccccatga 

gctgtcacggttttatttacatggggtcaggattccacgagggtagtgaaccattttagtcacaagggcagtggctgaagatcaagga 

gcgggcagtgaactctcctgaatcttcgcctgcttcttcattctccttcgtttagctaatagaataactgctgagttgtgaacagtaaggt 

gtatgtgaggtgctcgaaaacaaggtttcaggtgacgcccccagaataaaatttggacggggggttcagtggtggcattgtgctatg 

acaccaatataaccctcacaaaccccttgggcaataaatactagtgtaggaatgaaacattctgaatatctttaacaatagaaatccatg 

gggtggggacaagccgtaaagactggatgtccatctcacacgaatttatggctatgggcaacacataatcctagtgcaatatgatact 

ggggttattaagatgtgtcccaggcagggaccaagacaggtgaaccatgttgttacactctatttgtaacaaggggaaagagagtgg 

acgccgacagcagcggactccactggttgtctctaacacccccgciaaattaaacggggctccacgccaatggggcccataaacaa 
agacaagtggccactcttttttttgaaattgtggagtgggggcacgcgtcagcccccacacgccgccctgcggttttggactgt^ 
taagggtgtaataacttggctgattgtaaccccgctaaccactgcggtcaaaccacttgcccacaaaaccactaatggcaccccggg 
gaatacctgcataagtaggtgggcgggccaagataggggcgcgattgctgcgatctggaggacaaattacacacacttgcgcctg 
agcgccaagcacagggttgttggtcctcatattcacgaggtcgctgagagcacggtgggctaatgttgccatgggtagcatatacta 
cccaaatatctggatagcatatgctatcctaatctatatctgggtagcataggctatcctaatctatatctgggtagcatatgctatcctaa 
tctatatctgggtagtatatgctatcctaatttatatctgggtagcataggctatcctaatctatatctgggtagcatatgctatcctaatctat 
atctgggtagtatatgctatcctaatctgtatccgggtagcatatgctatcctaatagagattagggtagtatatgctatcctaatttatatct 
gggtagcatatactacccaaatatctggatagcatatgctatcctaatctatatctgggtagcatatgctatcctaatctatatctgggtag 
cataggctatcctaatctatatctgggtagcatatgctatcctaatctatatctgggtagtatatgctatcctaatttatatctgggtagcata 
ggctatcctaatctatatctgggtagcatatgctatcctaatctatatctgggtagtatatgctatcctaatctgtatccgggtagcatatgc 
tatcctcatgcatatacagtcagcatatgatacccagtagtagagtgggagtgctatcctttgcatatgccgccacctcccaaggggg 
cgtgaattttcgctgcttgtccttttcctgctggttgctcccattcttaggtgaatttaaggaggccaggctaaagccgtcgcatgtctgat 
tgctcaccaggtaaatgtcgctaatgttttccaacgcgagaaggtgttgagcgcggagctgagtgacgtgacaacatgggtatgccc 
aattgccccatgttgggaggacgaaaatggtgacaagacagatggccagaaatacaccaacagcacgcatgatgtctactgggga 
tttattctttagtgcgggggaatacacggcttttaatacgattgagggcgtctcctaacaagttacatcactcctgcccttcctcaccctc 
atctccatcacctccttcatctccgtcatctccgtcatcaccctccgcggcagccccttccaccataggtggaaaccagggaggcaaa 
tctactccatcgtcaaagctgcacacagtcaccctgatattgcaggtaggagcgggctttgtcataacaaggtccttaatcgcatcctt 
caaaacctcagcaaatatatgagtttgtaaaaagaccatgaaataacagacaatggactcccttagcgggccaggttgtgggccgg 
gtccaggggccattccaaaggggagacgactcaatggtgtaagacgacattgtggaatagcaagggcagttcctcgccttaggttg 
taaagggaggtcttactacctccatatacgaacacaccggcgacccaagttccttcgtcggtagtcctttctacgtgactcctagccag 
gagagctcttaaaccttctgcaatgttctcaaatttcgggttggaacctccttgaccacgatgcttttccaaaccaccctccttttttgcgc 
cctgcctccatcaccctgaccccggggtccagtgcttgggccttctcctgggtcatctgcggggccctgctctatcgctcccggggg 
cacgtcaggctcaccatctgggccaccttcttggtggtattcaaaataatcggcttcccctacagggtggaaaaatggccttctacctg 
gagggggcctgcgcggtggagacccggatgatgatgactgactactgggactcctgggcctcttttctccacgtccacgacctctc 
cccctggctctttcacgacttccccccctggctctttcacgtcctctaccccggcggcctccactacctcctcgaccccggcctccact 
acctcctcgaccccggcctccactgcctcctcgaccccggcctccacctcctgctcctgcccctcctgctcctgcccctcctcctgct 
cctgcccctcctgcccctcctgctcctgcccctcctgcccctcctgctcctgcccctcctgcccctcctgctcctgcccctcctgcccct 
cctcctgctcctgcccctcctgcccctcctcctgctcctgcccctcctgcccctcctgctcctgcccctcctgcccctcctgctcctgcc 
cctcctgcccctcctgctcctgcccctcctgctcctgcccctcctgctcctgcccctcctgctcctgcccctcctgcccctcctgcccct 
cctcctgctcctgcccctcctgctcctgcccctcctgcccctcctgcccctcctgctcctgcccctcctcctgctcctgcccctcctgcc 
cctcctgcccctcctcctgctcctgcccctcctgcccctcctcctgctcctgcccctcctcctgctcctgcccctcctgcccctcctgcc 
cctcctcctgctcctgcccctcctgcccctcctcctgctcctgcccctcctcctgctcctgcccctcctgcccctcctgcccctcctcct 
gctcctgcccctcctcctgctcctgcccctcctgcccctcctgcccctcctgcccctcctcctgctcctgcccctcctcctgctcctgcc 
cctcctgctcctgcccctcccgctcctgctcctgctcctgttccaccgtgggtccctttgcagccaatgcaacttggacgtttttggggt 
ctccggacaccatctctatgtcttggccctgatcctgagccgcccggggctcctggtcttccgcctcctcgtcctcgtcctcttccccgt 
cctcgtccatggttatcaccccctcttctttgaggtccactgccgccggagccttctggtccagatgtgtctcccttctctcctaggccat 

ttccaggtcctgtacctggcccctcgtcagacatgattcacactaaaagagatcaatagacatctttattagacgaGgctcagtgaata 

cagggagtgcagactcctgccccctccaacagcccccccaccctcatccccttcatggtcgctgtcagacagatccaggtctgaaa 

attccccatcctccgaaccatcctcgtcctcatcaccaattactcgcagcccggaaaactcccgctgaacatcctcaagatttgcgtcc 

tgagcctcaagccaggcctcaaattcctcgtccccctttttgctggacggtagggatggggattctcgggacccctcctcttcctcttc 

aaggtcaccagacagagatgctactggggcaacggaagaaaagctgggtgcggcctgtgaggatcagcttatcgatgataagctg 

tcaaacatgagaattcttgaagacgaaagggcctcgtgatacgcctatttttataggttaatgtcatgataataatggtttcttagacgtca 

ggtggcacttttcggggaaatgtgcgcggaacccctatttgtttatttttctaaatacattcaaatatgtatccgctcatgagacaataacc 

ctgataaatgcttcaataatattgaaaaaggaagagtatgagtattcaacatttccgtgtcgcccttattcccttttttgcggcattttgcctt 

cctgtttttgctcacccagaaacgctggtgaaagtaaaagatgctgaagatcagttgggtgcacgagtgggttacatcgaactggatc 

tcaacagcggtaagatccttgagagttttcgccccgaagaacgttttccaatgatgagcacttttaaagttctgctatgtggcgcggtat 

tatcccgtgttgacgccgggcaagagcaactcggtcgccgcatacactattctcagaatgacttggttgagtactcaccagtcacaga 

aaagcatcttacggatggcatgacagtaagagaattatgcagtgctgccataaccatgagtgataacactgcggccaacttacttctg 

acaacgatcggaggaccgaaggagctaaccgcttttttgcacaacatgggggatcatgtaactcgccttgatcgttgggaaccgga 

gctgaatgaagccataccaaacgacgagcgtgacaccacgatgcctgcagcaatggcaacaacgttgcgcaaactattaactggc 

gaactacttactctagcttcccggcaacaattaatagactggatggaggcggataaagttgcaggaccacttctgcgctcggcccttc 

cggctggctggtttattgctgataaatctggagccggtgagcgtgggtctcgcggtatcattgcagcactggggccagatggtaagc 

cctcccgtatcgtagttatctacacgacggggagtcaggcaactatggatgaacgaaatagacagatcgctgagataggtgcctcac 

tgattaagcattggtaactgtcagaccaagtttactcatatatactttagattgatttaaaacttcatttttaatttaaaaggatctaggtgaa 

gatcctttttgataatctcatgaccaaaatcccttaacgtgagttttcgttccactgagcgtcagaccccgtagaaaagatcaaaggatc 

ttcttgagatcctttttttctgcgcgtaatctgctgcttgcaaacaaaaaaaccaccgctaccagcggtggtttgtttgccggatcaagag 

ctaccaactctttttccgaaggtaactggcttcagcagagcgcagataccaaatactgtccttctagtgtagccgtagttaggccacca 

cttcaagaactctgtagcaccgcctacatacctcgctctgctaatcctgttaccagtggctgctgccagtggcgataagtcgtgtcttac 

cgggttggactcaagacgatagttaccggataaggcgcagcggtcgggctgaacggggggttcgtgcacacagcccagcttgga 

gcgaacgacctacaccgaactgagatacctacagcgtgagctatgagaaagcgccacgcttcccgaagggagaaaggcggaca 

ggtatccggtaagcggcagggtcggaacaggagagcgcacgagggagcttccagggggaaacgcctggtatctttatagtcctgt 

cgggtttcgccacctctgacttgagcgtcgatttttgtgatgctcgtcaggggggcggagcctatggaaaaacgccagcaacgcgg 

cctttttacggttcctggccttttgctggccttgaagctgtccctgatggtcgtcatctacctgcctggacagcatggcctgcaacgcgg 

gcatcccgatgccgccggaagcgagaagaatcataatggggaaggccatccagcctcgcgtcgcgaacgccagcaagacgtag 

cccagcgcgtcggccccgagatgcgccgcgtgcggctgctggagatggcggacgcgatggatatgttctgccaagggttggtttg 

cgcattcacagttctccgcaagaattgattggctccaattcttggagtggtgaatccgttagcgaggtgccgccctgcttcatccccgt 

ggcccgttgctcgcgtttgctggcggtgtccccggaagaaatatatttgcatgtctttagttctatgatgacacaaaccccgcccagcg 

tcttgtcattggcgaattcgaacacgcagatgcagtcggggcggcgcggtccgaggtccacttcgcatattaaggtgacgcgtgtg 

gcctcgaacaccgagcgaccctgcagcgacccgcttaacagcgtcaacagcgtgccgcagatcccggggggcaatgagatatg 

aaaaagcctgaactcaccgcgacgtctgtcgagaagtttctgatcgaaaagttcgacagcgtctccgacctgatgcagctctcggag 
ggcgaagaatctcgtgctttcagcttcgatgtaggagggcgtggatatgtcctgcgggtaaatagctgcgccgatggtttctacaaag 
atcgttatgtttatcggcactttgcatcggccgcgctcccgattccggaagtgcttgacattggggaattcagcgagagcctgacctat 
tgcatctcccgccgtgcacagggtgtcacgttgcaagacctgcctgaaaccgaactgcccgctgttctgcagccggtcgcggagg 
ccatggatgcgatcgctgcggccgatcttagccagacgagcgggttcggcccattcggaccgcaaggaatcggtcaatacactac 
atggcgtgatttcatatgcgcgattgctgatccccatgtgtatcactggcaaactgtgatggacgacaccgtcagtgcgtccgtcgcg 
caggctctcgatgagctgatgctttgggccgaggactgccccgaagtccggcacctcgtgcacgcggatttcggctccaacaatgt 
cctgacggacaatggccgcataacagcggtcattgactggagcgaggcgatgttcggggattcccaatacgaggtcgccaacatc 
ttcttctggaggccgtggttggcttgtatggagcagcagacgcgctacttcgagcggaggcatccggagcttgcaggatcgccgcg 
gctccgggcgtatatgctccgcattggtcttgaccaactctatcagagcttggttgacggcaatttcgatgatgcagcttgggcgcag 
ggtcgatgcgacgcaatcgtccgatccggagccgggactgtcgggcgtacacaaatcgcccgcagaagcgcggccgtctggac 
cgatggctgtgtagaagtactcgccgatagtggaaaccgacgccccagcactcgtccggatcgggagatgggggaggctaactg 
aaacacggaaggagacaataccggaaggaacccgcgctatgacggcaataaaaagacagaataaaacgcacgggtgttgggtc 
gtttgttcataaacgcggggttcggtcccagggctggcactctgtcgataccccaccgagaccccattggggccaatacgcccgcg 
tttcttccttttccccaccccaccccccaagttcgggtgaaggcccagggctcgcagccaacgtcggggcggcaggccctgccata 
gccactggccccgtgggttagggacggggtcccccatggggaatggtttatggttcgtgggggttattattttgggcgttgcgtggg 
gtcaggtccacgactggactgagcagacagacccatggtttttggatggcctgggcatggaccgcatgtactggcgcgacacgaa 
caccgggcgtctgtggctgccaaacacccccgacccccaaaaaccaccgcgcggatttctggcgtgccaagctagtcgaccaatt 
ctcatgtttgacagcttatcatcgcagatccgggcaacgttgttgccattgctgcaggcgcagaactggtaggtatggaagatctatac 
attgaatcaatattggcaattagccatattagtcattggttatatagcataaatcaatattggctattggccattgcatacgttgtatctatat 
cataatatgtacatttatattggctcatgtccaatatgaccgccat 
[0481] pMCP1 (SEQ ID NO: 94) 

gacggatcgggagatctcccgatcccctatggtgcactctcagtacaatctgctctgatgccgcatagttaagccagtatctgctccct 
gcttgtgtgttggaggtcgctgagtagtgcgcgagcaaaatttaagctacaacaaggcaaggcttgaccgacaattgcatgaagaat 
ctgcttagggttaggcgttttgcgctgcttcgcgatgtacgggccagatatacgcgttgacattgattattgactagttattaatagtaatc 
aattacggggtcattagttcatagcccatatatggagttccgcgttacataacttacggtaaatggcccgcctggctgaccgcccaac 
gacccccgcccattgacgtcaataatgacgtatgttcccatagtaacgccaatagggactttccattgacgtcaatgggtggagtattt 
acggtaaactgcccacttggcagtacatcaagtgtatcatatgccaagtacgccccctattgacgtcaatgacggtaaatggcccgc 
ctggcattatgcccagtacatgaccttatgggactttcctacttggcagtacatctacgtattagtcatcgctattaccatggtgatgcgg 
ttttggcagtacatcaatgggcgtggatagcggtttgactcacggggatttccaagtctccaccccattgacgtcaatgggagtttgttt 
tggcaccaaaatcaacgggactttccaaaatgtcgtaacaactccgccccattgacgcaaatgggcggtaggcgtgtacggtggga 
ggtctatataagcagagctctctggctaactaagctttcggcgcgccgaggtaccatgggatccgaagacgccaaaaacataaaga 
aaggcccggcgccattctatcctctagaggatggaaccgctggagagcaactgcataaggctatgaagagatacgccctggttcct 
ggaacaattgcttttacagatgcacatatcgaggtgaacatcacgtacgcggaatacttcgaaatgtccgttcggttggcagaagctat 
gaaacgatatgggctgaatacaaatcacagaatcgtcgtatgcagtgaaaactctcttcaattctttatgccggtgttgggcgcgttattt 
atcggagttgcagttgcgcccgcgaacgacatttataatgaacgtgaattgctcaacagtatgaacatttcgcagcctaccgtagtgtt 
tgtttccaaaaaggggttgcaaaaaattttgaacgtgcaaaaaaaattaccaataatccagaaaattattatcatggattctaaaacgga 
ttaccagggatttcagtcgatgtacacgttcgtcacatctcatctacctcccggttttaatgaatacgattttgtaccagagtcctttgatcg 
tgacaaaacaattgcactgataatgaattcctctggatctactgggttacctaagggtgtggcccttccgcatagaactgcctgcgtca 
gattctcgcatgccagagatcctatttttggcaatcaaatcattccggatactgcgattttaagtgttgttccattccatcacggttttggaa 
tgtttactacactcggatatttgatatgtggatttcgagtcgtcttaatgtatagatttgaagaagagctgtttttacgatcccttcaggatta 
caaaattcaaagtgcgttgctagtaccaaccctattttcattcttcgccaaaagcactctgattgacaaatacgatttatctaatttacacg 
aaattgcttctgggggcgcacctctttcgaaagaagtcggggaagcggttgcaaaacgcttccatcttccagggatacgacaaggat 
atgggctcactgagactacatcagctattctgattacacccgagggggatgataaaccgggcgcggtcggtaaagttgttccatttttt 
gaagcgaaggttgtggatctggataccgggaaaacgctgggcgttaatcagagaggcgaattatgtgtcagaggacctatgattat 
gtccggttatgtaaacaatccggaagcgaccaacgccttgattgacaaggatggatggctacattctggagacatagcttactggga 
cgaagacgaacacttcttcatagttgaccgcttgaagtctttaattaaatacaaaggatatcaggtggcccccgctgaattggaatcga 
tattgttacaacaccccaacatcttcgacgcgggcgtggcaggtcttcccgacgatgacgccggtgaacttcccgccgccgttgttgt 
tttggagcacggaaagacgatgacggaaaaagagatcgtggattacgtcgccagtcaagtaacaaccgcgaaaaagttgcgcgg 
aggagttgtgtttgtggacgaagtaccgaaaggtcttaccggaaaactcgacgcaagaaaaatcagagagatcctcataaaggcca 
agaagggcggaaagtccaaattgcgcggccgctaactcgagaataaaatgaggaaattgcatcgcattgtctgagtaggtgtcattc 
tattctggggggtggggtggggcaggacagcaagggggaggattgggaagacaatagcaggcatgctggggatgcggtgggct 
ctatggcttctgaggcggaaagaaccagctggggctctagggggtatccccacgcgccctgtagcggcgcattaagcgcggcgg 
gtgtggtggttacgcgcagcgtgaccgctacacttgccagcgccctagcgcccgctcctttcgctttcttcccttcctttctcgccacgt 
tcgccggctttccccgtcaagctctaaatcgggggtccctttagggttccgatttagtgctttacggcacctcgaccccaaaaaacttg 
attagggtgatggttcacgtacctagaagttcctattccgaagttcctattctctagaaagtataggaacttccttggccaaaaagcctg 
aactcaccgcgacgtctgtcgagaagtttctgatcgaaaagttcgacagcgtctccgacctgatgcagctctcggagggcgaagaa 
tctcgtgctttcagcttcgatgtaggagggcgtggatatgtcctgcgggtaaatagctgcgccgatggtttctacaaagatcgttatgttt 
atcggcactttgcatcggccgcgctcccgattccggaagtgcttgacattggggaattcagcgagagcctgacctattgcatctcccg 
ccgtgcacagggtgtcacgttgcaagacctgcctgaaaccgaactgcccgctgttctgcagccggtcgcggaggccatggatgcg 
atcgctgcggccgatcttagccagacgagcgggttcggcccattcggaccgcaaggaatcggtcaatacactacatggcgtgattt 
catatgcgcgattgctgatccccatgtgtatcactggcaaactgtgatggacgacaccgtcagtgcgtccgtcgcgcaggctctcgat 
gagctgatgctttgggccgaggactgccccgaagtccggcacctcgtgcagcaaacaaaccaccgctggtagcggtttttttgtttg 
caagcagcagattacgcgcagaaaaaaaggatctcaagaagatcctttgatcttttctacggggtctgacgctcagtggaacgaaaa 
ctcacgttaagggattttggtcatgagattatcaaaaaggatcttcacctagatccttttaaattaaaaatgaagttttaaatcaatctaaag 
tatatatgagtaaacttggtctgacagttaccaatgcttaatcagtgaggcacctatctcagcgatctgtctatttcgttcatccatagttgc 
ctgactccccgtcgtgtagataactacgatacgggagggcttaccatctggccccagtgctgcaatgataccgcgagacccacgct 
caccggctccagatttatcagcaataaaccagccagccggaagggccgagcgcagaagtggtcctgcaactttatccgcctccatc 
cagtctattaattgttgccgggaagctagagtaagtagttcgccagttaatagtttgcgcaacgttgttgccattgctacaggcatcgtg 
gtgtcacgctcgtcgtttggtatggcttcattcagctccggttcccaacgatcaaggcgagttacatgatcccccatgttgtgcaaaaa 
agcggttagctccttcggtcctccgatcgttgtcagaagtaagttggccgcagtgttatcactcatggttatggcagcactgcataattc 
tcttactgtcatgccatccgtaagatgcttttctgtgactggtgagtactcaaccaagtcattctgagaatagtgtatgcggcg 
ttgctcttgcccggcgtcaatacgggataataccgcgccacatagcagaactttaaaagtgctcatcattggaaaacgttcttcgggg 
cgaaaactctcaaggatcttaccgctgttgagatccagttcgatgtaacccactcgtgcacccaactgatcttcagcatcttttactttca 
ccagcgtttctgggtgagcaaaaacaggaaggcaaaatgccgcaaaaaagggaataagggcgacacggaaatgttgaatactcat 

actcttcctttttcaatattattgaagcatttatcagggttattgtctcatgagcggatacatatttgaatgtatttagaaaaataaacaaata 

ggggttccgcgcacatttccccgaaaagtgccacctgacgtc 
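
The pMCP1 listing above contains the FRT recombination site referred to in paragraph [0478]. The following Python sketch shows one way to locate such a site in a sequence listing on either strand. The 48-nucleotide FRT string used here is the commonly cited form of the site and is stated as an assumption rather than taken from the description; the function names are hypothetical, and the excerpt is a single line copied from the pMCP1 listing.

```python
# Commonly cited 48-nt FRT site (an assumption; not quoted from the description).
FRT_SITE = "gaagttcctattccgaagttcctattctctagaaagtataggaacttc"

_COMPLEMENT = str.maketrans("acgt", "tgca")


def reverse_complement(seq):
    """Return the reverse complement of a lowercase DNA string."""
    return seq.translate(_COMPLEMENT)[::-1]


def find_sites(sequence, site=FRT_SITE):
    """Return sorted 0-based start positions of site on either strand.

    Whitespace and line breaks in the listing are removed before searching.
    """
    seq = "".join(sequence.split()).lower()
    hits = set()
    for motif in (site, reverse_complement(site)):
        start = seq.find(motif)
        while start != -1:
            hits.add(start)
            start = seq.find(motif, start + 1)
    return sorted(hits)


# One line excerpted from the pMCP1 listing above.
excerpt = """
attagggtgatggttcacgtacctagaagttcctattccgaagttcctattctctagaaagtataggaacttccttggccaaaaagcctg
"""
positions = find_sites(excerpt)
print(positions)
```

Searching both strands matters because a listing may present the site in either orientation relative to the reporter cassette.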

[0482] Equivalents: 

[0483] The present invention is not to be limited in scope by the specific embodiments 
described herein. Indeed, various modifications of the invention in addition to those 
described will become apparent to those skilled in the art from the foregoing description 
and accompanying figures. Such modifications are intended to fall within the scope of the 
appended claims. 

[0484] Various publications are cited herein, the disclosures of which are incorporated 
by reference in their entireties. 
1. A method for identifying a compound that modulates untranslated region- 
dependent expression of a vascular endothelial growth factor (VEGF) gene, said method 
comprising: 

(a) contacting a member of a library of compounds with a cell containing 
a nucleic acid comprising a reporter gene operably linked to an UTR 
of the VEGF gene; and 

(b) detecting a reporter protein translated from said reporter gene, 
wherein a compound that modulates untranslated region-dependent 
expression of a VEGF gene is identified if the expression of said 
reporter gene in the presence of a compound is altered as compared 
to the expression of said reporter gene in the absence of said 
compound or the presence of a control. 

2. A method for identifying a compound that modulates untranslated region- 
dependent expression of a VEGF gene, said method comprising: 

(a) contacting a member of a library of compounds with a cell-free 
translation mixture and a nucleic acid comprising a reporter gene 
operably linked to an UTR of the VEGF gene; and 

(b) detecting the expression of said reporter gene, wherein a compound 
that modulates untranslated region-dependent expression of a VEGF 
gene is identified if the expression of said reporter gene in the 
presence of a compound is altered as compared to the expression of 
said reporter gene in the absence of said compound or the presence of 
a control. 

3. The method of claim 1 or 2, wherein the UTR of the VEGF gene is the 5' 
untranslated region (5' UTR) of a VEGF gene. 

4. The method of claim 3, wherein the 5' UTR of the VEGF gene is operably 
linked upstream of the reporter gene. 

5. The method of claim 1 or 2, wherein the UTR of the VEGF gene is the 3' 
untranslated region (3' UTR) of a VEGF gene. 
6. The method of claim 5, wherein the 3' UTR of the VEGF gene is operably 
linked downstream of the reporter gene. 

7. The method of claim 3, wherein the nucleic acid further comprises the 3' 
UTR of a VEGF gene. 

8. The method of claim 7, wherein the 3' UTR of the VEGF gene is operably 
linked downstream of the reporter gene. 

9. The method of claim 1 or 2, wherein the reporter gene further comprises an 
intron. 

10. The method of claim 1 or 2, wherein the UTR of the VEGF gene comprises 
an iron response element ("IRE"), internal ribosome entry site ("IRES"), upstream open 
reading frame ("uORF"), or AU-rich element ("ARE"). 

11. The method of claim 1 or 2, wherein the nucleic acid is further 
polyadenylated at the 3' end. 

12. The method of claim 1 or 2, wherein the nucleic acid is not capped at the 5' 
end. 

13. The method of claim 1 or 2, wherein the reporter gene encodes firefly 
luciferase, renilla luciferase, click beetle luciferase, green fluorescent protein, yellow 
fluorescent protein, red fluorescent protein, cyan fluorescent protein, blue fluorescent 
protein, beta-galactosidase, beta-glucuronidase, beta-lactamase, chloramphenicol 
acetyltransferase, or alkaline phosphatase. 

14. The method of claim 1, wherein said cell is stably transfected with said 
nucleic acid. 

15. The method of claim 1, wherein said cell is transiently transfected with said 
nucleic acid. 

16. The method of claim 1, wherein said cell is transfected with an episomal 
expression vector comprising said nucleic acid. 
17. The method of claim 1 or 2, further comprising measuring the effect of said 
compound on the expression of the VEGF gene. 

18. The method of claim 1, wherein the cell is a human cell, a yeast cell, a 
mouse cell, a rat cell, a Chinese hamster ovary ("CHO") cell, a MCF-7 cell, a primary cell, 
or an undifferentiated cancer cell. 

19. The method of claim 18, wherein the human cell is a HeLa cell or a 293 cell. 

20. The method of claim 3, wherein the cell-free translation mixture is a cell 
extract. 

21. The method of claim 20, wherein the cell extract is derived from a human 
cell, a yeast cell, a mouse cell, a rat cell, a Chinese hamster ovary ("CHO") cell, a Xenopus 
oocyte, a MCF-7 cell, a primary cell, an undifferentiated cancer cell, a reticulocyte, or a rye 
embryo. 

22. The method of claim 1 or 2, wherein the compound is selected from a 
combinatorial library of compounds comprising peptoids, random biooligomers, 
diversomers, vinylogous polypeptides, nonpeptidal peptidomimetics, oligocarbamates, 
peptidyl phosphonates, peptide nucleic acid libraries, antibody libraries, carbohydrate 
libraries, and small organic molecule libraries. 

23. The method of claim 22, wherein the small organic molecule libraries are 
libraries of benzodiazepines, isoprenoids, thiazolidinones, metathiazanones, pyrrolidines, 
morpholino compounds, or diazepindiones. 

24. The method of claim 1, wherein the step of contacting a library of 
compounds with a cell is in an aqueous solution comprising a buffer and a combination of 
salts. 

25. The method of claim 24, wherein the aqueous solution approximates or 
mimics physiologic conditions. 

26. The method of claim 24, wherein the aqueous solution further comprises a 
detergent or a surfactant. 
27. The method of claim 1 or 2, further comprising (c) determining the structure 
of the compound that modulates untranslated region-dependent expression of the VEGF 
gene. 

28. The method of claim 27, wherein the structure of the compound is 
determined by mass spectroscopy, NMR, vibrational spectroscopy, or X-ray 
crystallography. 

29. The method of claim 1 or 2, wherein the compound directly binds to an RNA 
transcribed from the VEGF gene. 

30. The method of claim 1 or 2, wherein the compound binds to one or more 
proteins that modulate untranslated region-dependent expression of the VEGF gene. 

31. The method of claim 7, wherein the compound disrupts an interaction 
between the 5' UTR and the 3' UTR of the VEGF gene. 

32. A method of modulating the expression of a VEGF gene comprising 
contacting a cell expressing the VEGF gene with a compound, or a pharmaceutically 
acceptable salt thereof, identified according to the method of claim 1 or 2. 

33. A method of treating, preventing or ameliorating cancer or one or more 
symptoms thereof, said method comprising administering to a subject in need thereof an 
effective amount of a compound, or a pharmaceutically acceptable salt thereof, identified 
according to the method of claim 1 or 2, wherein said effective amount decreases the 
expression of the VEGF gene. 

34. A method of inhibiting or reducing angiogenesis, said method comprising 
administering to a subject in need thereof a prophylactically or therapeutically effective 
amount of a compound, or a pharmaceutically acceptable salt thereof, identified according 
to the method of claim 1 or 2, wherein said effective amount decreases the expression of the 
VEGF gene. 

35. A method of identifying a compound that inhibits or reduces angiogenesis, 
said method comprising: 

(a) contacting a member of a library of compounds with a cell containing 
a nucleic acid comprising a reporter gene operably linked to an UTR 
of a VEGF gene; and 

(b) detecting the expression of said reporter gene, wherein if a compound 
that reduces the expression of said reporter gene relative to the 
expression of said reporter gene in the absence of said compound or 
the presence of a control is detected in (b), then 

(c) contacting the compound with a tumor cell and detecting the 
proliferation of said tumor cell, so that if the compound reduces or 
inhibits the proliferation of the tumor cell, the compound is identified 
as a compound that inhibits or reduces angiogenesis. 



36. The method of claim 35 further comprising (d) testing said compound in an 
animal model for angiogenesis, wherein said testing comprises administering said 
compound to said animal model and verifying that angiogenesis is inhibited by said 
compound in said animal model. 

37. A method of treating, preventing or ameliorating cancer or one or more 
symptoms thereof, said method comprising administering to a subject in need thereof an 
effective amount of a compound, or a pharmaceutically acceptable salt thereof, identified 
according to the method of claim 35. 

38. A method of inhibiting or reducing angiogenesis, said method comprising 
administering to a subject in need thereof an effective amount of a compound, or a 
pharmaceutically acceptable salt thereof, identified according to the method of claim 35. 

39. A method of identifying a compound that inhibits or reduces angiogenesis, 
said method comprising: 

(a) contacting a member of a library of compounds with a cell-free 
translation mixture and a nucleic acid comprising a reporter gene 
operably linked to an UTR of a VEGF gene; and 

(b) detecting the expression of said reporter gene, wherein if a compound 
that reduces the expression of said reporter gene relative to the 
expression of said reporter gene in the absence of said compound or 
the presence of a control is detected in (b), then 

(c) contacting the compound with a tumor cell and detecting the 
proliferation of said tumor cell, so that if the compound reduces or 
inhibits the proliferation of the tumor cell, the compound is identified 
as a compound that inhibits or reduces angiogenesis. 

40. The method of claim 39 further comprising (d) testing said compound in an 
animal model for angiogenesis, wherein said testing comprises administering said 
compound to said animal model and verifying that angiogenesis is inhibited by said 
compound in said animal model. 

41. A method of inhibiting or reducing angiogenesis, said method comprising 
administering to a subject in need thereof an effective amount of a compound, or a 
pharmaceutically acceptable salt thereof, identified according to the method of claim 39. 

42. A method of treating, preventing or ameliorating cancer or one or more 
symptoms thereof, said method comprising administering to a subject in need thereof an 
effective amount of a compound, or a pharmaceutically acceptable salt thereof, identified 
according to the method of claim 39. 

43. The method of claim 1 or 2 further comprising determining the specificity of 
the compound for the VEGF untranslated region. 
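
By way of illustration only, the decision logic of steps (a) to (c) of claims 35 and 39 can be sketched as follows. The function names, the ratio-based comparison, and the 0.5 cut-off are assumptions introduced for this sketch; the claims do not specify a numerical threshold or any particular data format.

```python
def reduces(treated: float, control: float, max_ratio: float = 0.5) -> bool:
    """Return True if the treated signal is at most max_ratio of the control.

    max_ratio is a hypothetical cut-off chosen for illustration only.
    """
    if control <= 0:
        raise ValueError("control signal must be positive")
    return treated / control <= max_ratio


def is_identified(reporter_treated: float, reporter_control: float,
                  prolif_treated: float, prolif_control: float,
                  max_ratio: float = 0.5) -> bool:
    """Apply step (b), then step (c), to one library member.

    Step (b): reporter expression must be reduced relative to the control.
    Step (c): only compounds passing (b) are tested on tumor cell
    proliferation, which must also be reduced.
    """
    if not reduces(reporter_treated, reporter_control, max_ratio):
        return False  # compound fails the reporter screen; (c) is not run
    return reduces(prolif_treated, prolif_control, max_ratio)
```

A compound for which both the reporter signal and tumor cell proliferation fall below the assumed cut-off would be flagged; a compound failing either step would not.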

FIGURE 2A 

FIGURES 3A TO 3B 

FIGURE 8A 

FIGURE 8B 

SEQUENCE LISTING 

<110> PTC Therapeutics, Inc. 

<120> METHODS FOR IDENTIFYING COMPOUNDS THAT MODULATE UNTRANSLATED 
REGION-DEPENDENT GENE EXPRESSION AND METHODS OF USING SAME 

<130> 10589-012-228 

<140> 
<141> 

<150> 60/441,637 

<151> 2003-01-21 

<160> 94 

<170> PatentIn version 3.2 

<210> 1 

<211> 14 

<212> DNA 

<213> Artificial 

<220> 

<223> Description of Artificial Sequence: Motif 



<220> 

<221> misc_feature 

<222> 3, 7, 8, 11 

<223> n = a, t, c, or g 

<220> 

<221> misc_feature 
<222> (7)..(8) 

<223> This represents one form of the sequence as described, other forms 
described may have up to five nucleotides in this variable region 

<400> 1 

ggntggnngg ntgg 14 
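
For illustration, the motif of SEQ ID NO: 1 can be expressed as a pattern in which each "n" matches any nucleotide and the variable region at positions 7 to 8 may be longer. The sketch below is not part of the sequence listing; the function name and the lower bound of one nucleotide for the variable region are assumptions, since the feature note states only an upper bound of five.

```python
import re

# gg n tgg [variable region] gg n tgg, following SEQ ID NO: 1.
# The variable region is assumed to be 1 to 5 nucleotides long; only the
# upper bound of five comes from the feature note above.
MOTIF = re.compile(r"gg[acgt]tgg[acgt]{1,5}gg[acgt]tgg", re.IGNORECASE)


def find_motif(sequence: str) -> list[tuple[int, str]]:
    """Return (0-based start, matched text) for each non-overlapping match."""
    return [(m.start(), m.group(0)) for m in MOTIF.finditer(sequence)]
```

Applied to a DNA string, `find_motif` reports where a form of the motif occurs; overlapping occurrences are not reported by this simple scan.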



<210> 2 

<211> 14 

<212> DNA 

<213> Artificial 

<220> 

<223> Description of Artificial Sequence: Motif 



<220> 

<221> misc_feature 

<222> 3, 4, 7, 8, 11, 12 

<223> n = a, t, g or c 

<220> 

<221> misc_feature 

<222> (2)..(12) 



<223> This represents one form of the sequence as described, other forms 
described have longer variable regions, typical is 2 - 10 
nucleotides 

<400> 2 

ggnnggnngg nngg 14 



<210> 3 

<211> 14 

<212> DNA 

<213> Artificial 

<220> 

<223> Description of Artificial Sequence: Motif 



<220> 

<221> misc_feature 

<222> 3, 4, 7, 8, 11, 12 

<223> n = a, t, g, or c 

<220> 

<221> misc_feature 
<222> (2)..(12) 

<223> This represents one form of the sequence as described, other forms 
described have longer variable regions, typical is 2 - 10 
nucleotides 

<400> 3 

ggnnggnngg nngg 14 



<210> 4 

<211> 19 

<212> RNA 

<213> Artificial 

<220> 

<223> Description of Artificial Sequence: Motif 

<400> 4 

ccccrcGcuc uuccccaag 19 



<210> 5 

<211> 152 

<212> DNA 

<213> Homo sapiens 

<400> 5 

gcagaggacc agctaagagg gagagaagca actacagacc ccccctgaaa acaaccctca 60 

gacgccacat cccctgacaa gctgccaggc aggttctctt cctctcacat actgacccac 120 

ggctccaccc tctctcccct ggaaaggaca cc 152 



<210> 6 

<211> 792 

<212> DNA 

<213> Homo sapiens 

<400> 6 

tgaggaggac gaacatccaa ccttcccaaa cgcctcccct gccccaatcc Gtttattacc 60 

Gcctccttca gacaccctca acctcttctg gctcaaaaag agaattgggg gcttagggtc 120 

ggaacccaag cttagaactt taagcaacaa gaccaccact tcgaaacctg ggattcagga 180 

atgtgtggcc tgcacagtga attgctggca accactaaga attcaaactg gggcctccag 240 

aactcactgg ggcctacagc tttgatccct gacatctgga atctggagac cagggagcct 300 

ttggttctgg ccagaatgct gcaggacttg agaagacctc acctagaaat tgacacaagt 360 

ggaccttagg ccttcctctc tccagatgtt tccagacttc cttgagacac ggagcccagc 420 

cctccccatg gagccagctc cctctattta tgtttgcact tgtgattatt tattatttat 480 

ttattattta tttatttaca gatgaatgta tttatttggg agaccggggt atcctggggg 540 

acccaatgta ggagctgcct tggctcagac atgttttccg tgaaaacgga gctgaacaat 600 

aggctgttcc catgtagccc cctggcctct gtgccttctt ttgattatgt tttttaaaat 660 

atttatctga ttaagttgtc taaacaatgc tgatttggtg accaactgtc actcattgct 720 

gagcctctgc tccccagggg agttgtgtct gtaatcgccc tactattcag tggcgagaaa 780 

taaagtttgc tt 792 



<210> 7 

<211> 21 

<212> RNA 

<213> Artificial 

<220> 

<223> Description of Artificial Sequence: Motif 

<400> 7 

auuuauuuau uuauuuauuu a 21 



<210> 8 

<211> 40 

<212> DNA 

<213> Homo sapiens 

<400> 8 

kctggaggat gtggctgcag agcctgctgc tcttgggcac 40 



<210> 9 

<211> 289 

<212> DNA 

<213> Homo sapiens 

<400> 9 

gccggggagc tgctctctca tgaaacaaga gctagaaact caggatggtc atcttggagg 60 

gaccaagggg tgggccacag ccatggtggg agtggcctgg acctgccctg ggccacactg 120 

accctgatac aggcatggca gaagaatggg aatattttat actgacagaa atcagtaata 180 

tttatatatt tatattttta aaatatttat ttatttattt atttaagttc atattccata 240 

tttattcaag atgttttacc gtaataatta ttattaaaaa tatgcttct 289 



<210> 10 

<211> 21 

<212> RNA 

<213> Artificial 

<220> 

<223> Description of Artificial Sequence: Motif 

<400> 10 

auuuauuuau uuauuuauuu a 21 



<210> 11 

<211> 47 

<212> DNA 

<213> Homo sapiens 

<400> 11 

atcactctct ttaatcacta ctcacattaa cctcaactcc tgccaca 47 



<210> 12 

<211> 307 

<212> DNA 

<213> Homo sapiens 

<400> 12 

taattaagtg cttcccactt aaaacatatc aggccttcta tttatttatt taaatattta 60 

aattttatat ttattgttga atgtatggtt gctacctatt gtaactatta ttcttaatct 120 

taaaactata aatatggatc ttttatgatt ctttttgtaa gccctagggg ctctaaaatg 180 

gtttacctta tttatcccaa aaatatttat tattatgttg aatgttaaat atagtatcta 240 

tgtagattgg ttagtaaaac tatttaataa atttgataaa tataaaaaaa aaaaacaaaa 300 

aaaaaaa 307 



<210> 13 

<211> 15 

<212> RNA 

<213> Artificial 

<220> 

<223> Description of Artificial Sequence: Motif 



<220> 

<221> misc_feature 
<222> (1)..(15) 

<223> n = a, t, g or c 

<400> 13 

nauuuauuua uuuan 15 



<210> 14 

<211> 62 

<212> DNA 

<213> Homo sapiens 

<400> 14 

ttctgccctc gagcccaccg ggaacgaaag agaagctcta tctcgcctcc aggagcccag 60 
ct 62 



<210> 15 

<211> 427 

<212> DNA 

<213> Homo sapiens 

<400> 15 

tagcatgggc acctcagatt gttgttgtta atgggcattc cttcttctgg tcagaaacct 60 

gtccactggg cacagaactt atgttgttct ctatggagaa ctaaaagtat gagcgttagg 120 

acactatttt aattattttt aatttattaa tatttaaata tgtgaagctg agttaattta 180 

tgtaagtcat atttatattt ttaagaagta ccacttgaaa cattttatgt attagttttg 240 

aaataataat ggaaagtggc tatgcagttt gaatatcctt tgtttcagag ccagatcatt 300 

tcttggaaag tgtaggctta cctcaaataa atggctaact tatacatatt tttaaagaaa 360 

tatttatatt gtatttatat aatgtataaa tggtttttat accaataaat ggcattttaa 420 

aaaattc 427 



<210> 16 

<211> 15 

<212> RNA 

<213> Artificial 

<220> 

<223> Description of Artificial Sequence: Motif 



<220> 

<221> misc_feature 

<222> (1)..(15) 

<223> n = a, t, g or c 

<400> 16 

nauuuauuua uuuan 15 



<210> 17 

<211> 701 

<212> DNA 

<213> Homo sapiens 

<400> 17 

aagagctcca gagagaagtc gaggaagaga gagacggggt cagagagagc gcgcgggcgt 60 

gcgagcagcg aaagcgacag gggcaaagtg agtgacctgc ttttgggggt gaccgccgga 120 

gcgcggcgtg agccctcccc cttgggatcc cgcagctgac cagtcgcgct gacggacaga 180 

cagacagaca ccgcccccag cGCcagttac cacctcctcc ccggccggcg gcggacagtg 240 

gacgcggcgg cgagccgcgg gcaggggccg gagcccgccc ccggaggcgg ggtggagggg 300 

gtcggagctc gcggcgtcgc actgaaactt ttcgtccaac ttctgggctg ttctcgcttc 360 

ggaggagccg tggtccgcgc gggggaagcc gagccgagcg gagccgcgag aagtgctagc 420 

tcgggccggg aggagccgca gccggaggag ggggaggagg aagaagagaa ggaagaggag 480 

agggggccgc agtggcgact cggcgctcgg aagccgggct catggacggg tgaggcggcg 540 

gtgtgcgcag acagtgctcc agcgcgcgcg ctccccagcc ctggcccggc ctcgggccgg 600 

gaggaagagt agctcgccga ggcgccgagg agagcgggcc gccccacagc ccgagccgga 660 

gagggacgcg agccgcgcgc cccggtcggg cctccgaaac c 701 



<210> 18 

<211> 1892 

<212> DNA 

<213> Homo sapiens 

<400> 18 

tgagccgggc aggaggaagg agcctccctc agggtttcgg gaaccagatc tctctccagg 60 

aaagactgat acagaacgat cgatacagaa accacgctgc cgccaccaca ccatcaccat 120 

cgacagaaca gtccttaatc cagaaacctg aaatgaagga agaggagact ctgcgcagag 180 

cactttgggt ccggagggcg agactccggc ggaagcattc ccgggcgggt gacccagcac 240 

ggtccctctt ggaattggat tcgccatttt atttttcttg ctgctaaatc accgagcccg 300 

gaagattaga gagttttatt tctgggattc ctgtagacac acccacccac atacatacat 360 

ttatatatat atatattata tatatataaa aataaatatc tctattttat atatataaaa 420 

tatatatatt ctttttttaa attaacagtg ctaatgttat tggtgtcttc actggatgta 480 

tttgactgct gtggacttga gttgggaggg gaatgttccc actcagatcc tgacagggaa 540 

gaggaggaga tgagagactc tggcatgatc ttttttttgt cccacttggt ggggccaggg 600 

tcctctcccc tgcccaagaa tgtgcaaggc cagggcatgg gggcaaatat gacccagttt 660 

tgggaacacc gacaaaccca gccctggcgc tgagcctctc taccccaggt cagacggaca 720 

gaaagacaaa tcacaggttc cgggatgagg acaccggctc tgaccaggag tttggggagc 780 

ttcaggacat tgctgtgctt tggggattcc ctccacatgc tgcacgcgca tctcgcGccc 840 

aggggcactg cctggaagat tcaggagcct gggcggcctt cgcttactct cacctgcttc 900 

tgagttgccc aggaggccac tggcagatgt cccggcgaag agaagagaca cattgttgga 960 

agaagcagcc catgacagcg ccccttcctg ggactcgccc tcatcctctt cctgctcccc 1020 

ttcctggggt gcagcctaaa aggacctatg tcctcacacc attgaaacca ctagttctgt 1080 

ccccccagga aacctggttg tgtgtgtgtg agtggttgac cttcctccat cccctggtcc 1140 

ttcccttccc ttcccgaggc acagagagac agggcaggat ccacgtgccc attgtggagg 1200 

cagagaaaag agaaagtgtt ttatatacgg tacttattta atatcccttt ttaattagaa 1260 

attagaacag ttaatttaat taaagagtag ggtttttttt cagtattctt ggttaatatt 1320 

taatttcaac tatttatgag atgtatcttt tgctctctct tgctctctta tttgtaccgg 1380 

tttttgtata taaaattcat gtttccaatc tctctctccc tgatcggtga cagtcactag 1440 

cttatcttga acagatattt aattttgcta acactcagct ctgccctccc cgatcccctg 1500 

gctccccagc acacattcct ttgaaagagg gtttcaatat acatctacat actatatata 1560 

tattgggcaa cttgtatttg tgtgtatata tatatatata tgtttatgta fcatatgtgat 1620 

cctgaaaaaa taaacatcgc tattctgttt tttatatgtt caaaccaaac aagaaaaaat 1680 

agagaattct acatactaaa tctctictcct tttttaattt taatatttgt tatcatttat 1740 

ttattggtgc tactgtttat ccgtaataat tgtggggaaa agatattaac atcacgtctt 1800 

tgtctctagt gcagtttttc gagatattcc gtagtacata tttattttta aacaacgaca 1860 

aagaaataca gatatatctt aaaaaaaaaa aa 1892 



<210> 19 

<211> 249 

<212> RNA 

<213> Homo sapiens 

<400> 19 

ccgggcucau ggacggguga ggcggcggug ugcgcagaca gugcuccagc gcgcgcgcuc 60 

cccagcccug gcccggccuc gggccgggag gaagaguagc ucgccgaggc gccgaggaga 120 

gcgggccgcc ccacagcccg agccggagag ggacgcgagc cgcgcgcccc ggucgggccu 180 

ccgaaaccau gaacuuucug cugucuuggg ugcauuggag ccuugccuug cugcucuacc 240 

uccaccaug 249 



<210> 20 

<211> 15 

<212> RNA 

<213> Artificial 

<220> 

<223> Description of Artificial Sequence: Motif 



<220> 

<221> misc_feature 

<222> (1)..(15) 

<223> n = a, t, g or c 

<400> 20 

nauuuauuua uuuan 15 



<210> 21 

<211> 49 

<212> DNA 

<213> Homo sapiens 

<400> 21 

ccgccagatt tgaatcgcgg gacccgttgg cagaggtggc ggcggcggc 49 

<210> 22 

<211> 1141 

<212> DNA 

<213> Homo sapiens 

<400> 22 

ggcctctggc cggagctgcc tggtcccaga gtggctgcac cacttccagg gtttattccc 60 

tggtgccacc agccttcctg tgggcccctt agcaatgtct taggaaagga gatcaacatt 120 

ttcaaattag atgtttcaac tgtgctcctg ttttgtcttg aaagtggcac cagaggtgct 180 

tctgcctgtg cagcgggtgc tgctggtaac agtggctgct tctctctctc tctctctttt 240 

ttgggggctc atttttgctg ttttgattcc cgggcttacc aggtgagaag tgagggagga 300 

agaaggcagt gtcccttttg ctagagctga cagctttgtt cgcgtgggca gagccttcca 360 

cagtgaatgt gtctggacct catgttgttg aggctgtcac agtcctgagt gtggacttgg 420 

caggtgcctg ttgaatctga gctgcaggtt ccttatctgt cacacctgtg cctcctcaga 480 

ggacagtttt tttgttgttg tgtttttttg tttttttttt ttggtagatg catgacttgt 540 

gtgtgatgag agaatggaga cagagtccct ggctcctcta ctgtttaaca acatggcttt 600 

cttattttgt ttgaattgtt aattcacaga atagcacaaa ctacaattaa aactaagcac 660 

aaagccattc taagtcattg gggaaacggg gtgaacttca ggtggatgag gagacagaat 720 

agagtgatag gaagcgtctg gcagatactc cttttgccac tgctgtgtga ttagacaggc 780 

ccagtgagcc gcggggcaca tgctggccgc tcctccctca gaaaaaggca gtggcctaaa 840 

tcctttttaa atgacttggc tcgatgctgt gggggactgg ctgggctgct gcaggccgtg 900 

tgtctgtcag cccaaccttc acatctgtca cgttctccac acgggggaga gacgcagtcc 960 

gcccaggtcc ccgctttctt tggaggcagc agctcccgca gggctgaagt ctggcgtaag 1020 

atgatggatt tgattcgccc tcctccctgt catagagctg cagggtggat 
tcgctggaaa cctctggagg tcatctcggc tgttcctgag aaataaaaag 
c 

<210> 23 

<211> 247 

<212> DNA 

<213> Homo sapiens 

<400> 23 



ccccggcgca gcgcggccgc agcagcctcc gccccccgca cggtgtgagc gcccgacgcg 60 

gccgaggcgg ccggagtccc gagctagccc cggcggccgc cgccgcccag accggacgac 120 

aggccacctc gtcggcgtcc gcccgagtcc ccgcctcgcc gccaacgcca caaccaccgc 180 

gcacggcccc ctgactccgt ccagtattga tcgggagagc cggagcgagc tcttcgggga 240 

gcagcag 247 



<210> 24 

<211> 1716 

<212> DNA 

<213> Homo sapiens 

<400> 24 

tgaccacgga ggatagtatg agccctaaaa atccagactc tttcgatacc caggaccaag 60 

ccacagcagg tcctccatcc caacagccat gcccgcatta gctcttagac ccacagactg 120 

gttttgcaac gtttacaccg actagccagg aagtacttcc acctcgggca cattttggga 180 

agttgcattc ctttgtcttc aaactgtgaa gcatttacag aaacgcatcc agcaagaata 240 

ttgtcccttt gagcagaaat ttatctttca aagaggtata tttgaaaaaa aaaaaaaaag 300 

tatatgtgag gatttttatt gattggggat cttggagttt ttcattgtcg ctattgattt 360 

ttacttcaat gggctcttcc aacaaggaag aagcttgctg gtagcacttg ctaccctgag 420 

ttcatccagg cccaactgtg agcaaggagc acaagccaca agtcttccag aggatgcttg 480 

attccagtgg ttctgcttca aggcttccac tgcaaaacac taaagatcca agaaggcctt 540 

catggcccca gcaggccgga tcggtactgt atcaagtcat ggcaggtaca gtaggataag 600 

ccactctgtc ccttcctggg caaagaagaa acggagggga tgaattcttc cttagactta 660 

cttttgtaaa aatgtcccca cggtacttac tccccactga tggaccagtg gtttccagtc 720 

atgagcgtta gactgacttg tttgtcttcc attccattgt tttgaaactc agtatgccgc 780 

ccctgtchtg ctgtcatgaa atcagcaaga gaggatgaca catcaaataa taactcggat 840 

tccagcccac attggattca tcagcatttg gaccaatagc ccacagctga gaatgtggaa 900 

tacctaagga taacaccgct tttgttctcg caaaaacgta tctcctaatt tgaggctcag 960 

atgaaatgca tcaggtcctt tggggcatag atcagaagac tacaaaaatg aagctgctct 1020 

gaaatctcct ttagccatca ccccaacccc ccaaaattag tttgtgttac ttatggaaga 1080 

tagttttctc cttttacttc acttcaaaag ctttttactc aaagagtata tgttccctcc 1140 

aggtcagctg cccccaaacc ccctccttac gctttgtcac acaaaaagtg tctctgcctt 1200 

gagtcatcta ttcaagcact tacagctctg gccacaacag ggcattttac aggtgcgaat 1260 

gacagtagca ttatgagtag tgtgaattca ggtagtaaat atgaaactag ggtttgaaat 1320 

tgataatgct ttcacaacat ttgcagatgt tttagaagga aaaaagttcc ttcctaaaat 1380 

aatttctcta caattggaag attggaagat tcagctagtt aggagcccat tttttcctaa 1440 

tctgtgtgtg ccctgtaacc tgactggtta acagcagtcc tttgtaaaca gtgttttaaa 1500 

ctctcctagt caatatccac cccatccaat ttatcaagga agaaatggtt cagaaaatat 1560 

tttcagccta cagttatgtt cagtcacaca cacatacaaa atgttccttt tgcttttaaa 1620 

gtaatttttg actcccagat cagtcagagc ccctacagca ttgttaagaa agtatttgat 1680 

ttttgtctca atgaaaataa aachatattc atttcc 1716 



<210> 25 

<211> 160 

<212> DNA 

<213> Homo sapiens 

<400> 25 

tataaaagct gggccggcgc gggccgggcc attcgcgacc cggaggtgcg cgggcgcggg 60 

cgagcagggt ctccgggtgg gcggcgcgac gccccgcgca ggctggaggc cgccgaggct 120 

cgccatgccg ggagaactct aactccccca tggagtcggc 160 

<210> 26 

<211> 1306 

<212> DNA 

<213> Homo sapiens 

<400> 26 

tgaggcgcgc ggctgtggga ccgccctggg ccagcctccg gcggggaccc agggagtggt 60 

ttggggtcgc cggatctcga ggcttgccca gaccgtgcga gccaggacta ggagattccg 120 

gtgcctcctg aaagcctggc ctgctccgcg tgtcccctcc cttcctctgc gccggacttg 180 

gtgcgtctaa gatgaggggg ccaggcggtg gcttctccct gcgaggaggg gagaattctt 240 

ggggctgagc tgggagcccg gcaactctag tatttaggat aacttgtgcc ttggaaatgc 300 

aaactcaccg ctccaatgcc tactgagtag ggggagcaaa tcgtgccttg tcattttatt 360 

tggaggtttc chgcctcctt cccgaggcta cagcagaccc ccatgagaga aggaggggag 420 

caggcccgtg gaggaggggg gctcagggag ctgagatccc gacaagcccg ccagccccag 480 

ccgctcctcc acgcctgtcc ttagaaaggg gtggaaacat agggacttgg ggcttggaac 540 

ctaaggttgt tccctagttc tacatgaagg tggaggtctc tagthccacg cctctcccac 600 

ctccctccgc acacacccca cccagcctgc tataggctgg ctttcccttg gggctggaac 660 

tcactgcgat ggggtcacca ggtgaccagt ggagccccca ccccgagtca gaccagaaag 720 

ctaggtcgtg ggtcagctct gaggatgtat acccctggtg ggagagggag acctagagat 780 

ctggctgtgg ggcgggcatg gggggtgaag ggccactggg accctcagcc ttgtttgtac 840 

tgtatgcctt cagcattgcc taggaacacg aagcacgatc agtccatcca gagggaccgg 900 

agttatgaca agcttcccaa atattttgct ttatcagccg atatcaacac ttgtatctgg 960 

cctctgtgcc cagcagtgcc ttgtgcaatg tgaatgtacc gtctctgcta aaccaccatt 1020 

ttatttggtt ttgttttgtt tggttttctc ggatacttgc caaaatgaga ctctccgtcg 1080 

gcagctgggg gaagggtctg agactctctt tccttttggt tttgggatta cttttgatcc 1140 

tgggggacca atgaggtgag gggggttctc ctttgccctc agctttccca gccctccggc 1200 

ctgggctgcc cacaaggctt ctcccccaga ggccctggct cctggtcggg aagggaggtg 1260 

cctcccgcca acgcatcact ggggctggga gcagggaagg gaattc 1306 

<210> 27 

<211> 216 

<212> DNA 

<213> Homo sapiens 

<400> 27 

agcgagagcg cccccgagca gcgcccgcgc cctccgcgcc ttctccgccg ggacctcgag 60 

cgaaagacgc ccgcccgccg cccagccctc gcctccctgc ccaccgggca caccgcgccg 120 

ccaccccgac cccgctgcgc acggcctgtc cgctgcacac cagcttgttg gcgtcttcgt 180 

cgccgcgctc gccccgggct actcctgcgc gccaca 216 

<210> 28 

<211> 687 

<212> DNA 

<213> Homo sapiens 

<400> 28 

taaatgctac ctgggtttcc agggcacacc tagacaaaca rgggagaaga gtgtcagaat 60 

cagaatcatg gagaaaatgg gcgggggtgg tgtgggtgat gggactcatt gtagaaagga 120 

agccttgctc attcttgagg agcattaagg tatttcgaaa ctgccaaggg tgctggtgcg 180 

gatggacact aatgcagcca cgattggaga atactttgct tcatagtatt ggagcacatg 240 

ttactgcttc attttggagc ttgtggagtt gatgactttc tgttttctgt ttgtaaatta 300 

tttgctaagc atattttctc taggcttttt tccttttggg gttctacagt cgtaaaagag 360 

ataataagat tagttggaca gtttaaagct tttattcgtc ctttgacaaa agtaaatggg 420 

agggcattcc atcccttcct gaagggggac actccatgag tgtctgtgag aggcagctat 480 

ctgcactcta aactgcaaac agaaatcagg tgttttaaga ctgaatgttt tatttatcaa 540 

aatgtagctt ttggggaggg aggggaaatg taatactgga ataatttgta aatgatttta 600 

attttatatt cagtgaaaag attttattta tggaattaac catttaataa agaaatattt 660 

acctaaaaaa aaaaaaaaaa aaaaaaa 687 

<210> 29 

<211> 310 

<212> DNA 

<213> Homo sapiens 

<400> 29 

cggccccaga aaacccgagc gagtaggggg cggcgcgcag gagggaggag aactgggggc 60 

gcgggaggct ggtgggtgtc gggggtggag atgtagaaga tgtgacgccg cggcccggcg 120 

ggtgccagat tagcggacgg ctgcccgcgg ttgcaacggg atcccgggcg ctgcagcttg 180 

ggaggcggct ctccccaggc ggcgtccgcg gagacaccca tccgtgaacc ccaggtcccg 240 

ggccgccggc tcgccgcgca ccaggggccg gcggacagaa gagcggccga gcggctcgag 300 

gctgggggac 310 

<210> 30 

<211> 5882 

<212> DNA 

<213> Homo sapiens 

<400> 30 

ctgctaagag ctgattttaa tggccacatc taatctcatt tcacatgaaa gaagaagtat 60 

attttagaaa tttgttaatg agagtaaaag aaaataaatg tgtatagctc agtttggata 120 

attggtcaaa caatttttta tccagtagta aaatatgtaa ccattgtccc agtaaagaaa 180 

aataacaaaa gttgtaaaat gtatattctc ccttttatat tgcatctgct gttacccagt 240 

gaagcttacc tagagcaatg atctttttca cgcatttgct ttattcgaaa agaggctttt 300 

aaaatgtgca tgtttagaaa caaaatttct tcatggaaat catatacatt agaaaatcac 360 

agtcagatgt ttaatcaatc caaaatgtcc actatttctt atgtcattcg ttagtctaca 420 

tgtttctaaa catataaatg tgaatttaat caattccttt catagtttta taattctctg 480 

gcagttcctt atgatagagt ttataaaaca gtcctgtgta aactgctgga agttcttcca 540 

cagtcaggtc 


aattttgtca 


aacccttctc 


tgctggtgat 


9993.gttgta 


ttttcagtct 


atcttaagca 


ttcttcctgg 


caaaaattta 


tgatatacat 


atctgacttc 


ccaaaagctc 


ggacggacct 


gaattctgat 


t ttat ac c ag 


ctcctacgta 


aaaaaagaga 


tg'tacaaatc 


atcaaagatt 


ttcagttaaa 


gtagcattat 


taaagtttcc 


aatacaaatt 


ctttgccttg 


taccactgta 


aattcaagaa 


gc t tt tgaaa 


ggcttatcta 


cctgtacatt 


tttggggtca 


ccaaaaggta 


aaaatataga 


ttgaaaagtfc 


tttcttgaga 


t aagattcca 


aagaacttag 


gtgtttgatc 


agttttcaag 


aaacttggaa 


4-^ ^.^m,>^4-4— 4-4- 

ttcacatttt 


ataaggttga 


4-4-4-l-4-^'3 •^4-+' 

tCtcticaaTiu 


tgccatliaac 


-^^-i^^4-4-+> f^4- 

anattcttgc 


ggctgctttfc 


gctttctcta 


-1 4- 4- 4- 4- ^4- r^-^ 4- 

a c u ii Ti gti ga u. 


gttc'bg'bcat 


Izaaaaagctg 


ccttccticta 


ccacbt-tgcb 


4- 4- +- 4- 4- .--r4- 4- ^ y-i 

uutttgtccc 


aatactcgtit 


ttgccbcbat 


ttgccc'k'tgc 


agtaattcta 


ctzggtgaaaa 


tg t c b c aat t 


ccca'kg'bgc'k 


gbgacbgizag 


ccctggatat 


gctcttigt-Ct 


tttccctcta 


caaugcccua 


aaacataagg 


cat beat ctg 


acaattgaga 


tttgcccat a 


ggttaaacat 


aatctaggcc 


gggtgcagtg 


gctcatgcct 


ggagga t cgc 


ttgagcccag 


gagt t caaga 


aaacacaaaa 


aatagccagg 


catggtggcg 


tgaggfcggga 


gggttgatca 


cttgaggctg 


tgccactgca 


gtccagccta 


ggcaacagag 


tccttaataa 


gaaaagtaat 


ttttactctg 


atttaagatg 


gtagcactag 


tcttaaattg 


ccatttttat 


tcattatgct 


ttgaaaaata 



tgtacccata cagcagcagc ctagcaactc 600 

tcgccaggtc attgagatcc atccactcac 660 

tggtgaatga atatggcttt aggcggcaga 720 

caggatttgt gtgctgttgc cgaatactca 780 

tctcttcaaa aacttctcga accgctgtgt 840 

aataataatt acacttttag aaactgtatc 900 

gtaaaggctc aaaacattac cctaacaaag 960 

tggatatcaa gaaatcccaa aatattttct 1020 

tgctgaatat ttctttggct gctacttgga 1080 

gctcttttta acttcttgct gctctttttc 1140 

aaaacatttt gcatggctgc agttcctttg 1200 

attcatttct tcaacaccga aatgctggag 1260 

tataaataat tttataattc aacaaaggtt 1320 

aaatgcaaat ttgtgtggca ggatttttat 1380 

tctacacatc cagatggtcc ctctaactgg 1440 

tgtctcccaa agtatttagg agaagccctt 1500 

ggaaagcttc acaattgtca cagacaaaga 1560 

ttttcttght tgtcaaatag taaatgatat 1620 

acatgcaaag aagaggaagt cacagaaaca 1680 

actgtcttac catagactgt cttacccatc 1740 

atagctatgg aaagatgcat agaaagagta 1800 

ccatttttca attacatgct gacttccctt 1860 

ggttagaaac aactgaaagc ataaaagaaa 1920 

atattccctg cactttggga ggccaaagca 1980 

ccaacctggt gaaaccccgt ctctacaaaa 2040 

tgtacatgtg gtctcagata cttgggaggc 2100 

agaggtcaag gttgcagtga gccataatcg 2160 

tgagactttg tctcaaaaaa agagaaattt 2220 

atgtgcaata catttgttat taaatttatt 2280 

tataaaatat cccctaacat gtttaaatgt 2340 

attatgggga aatacatgtt tgttattaaa 2400 

tttattatta aagatagtag cactagtctt aaatttgata taacatctcc taacttgttt 2460 

aaatgtccat ttttattctt tatgcttgaa aataaattat ggggatccta tttagctctt 2520 

agtaccacta atcaaaagtt cggcatgtag ctcatgatct atgctgtttc tatgtcgtgg 2580 

aagcaccgga tgggggtagt gagcaaatct gccctgctca gcagtcacca tagcagctga 2640 

ctgaaaatca gcactgcctg agtagttttg atcagtttaa cttgaatcac taactgactg 2700 

aaaattgaat gggcaaataa gtgcttttgt ctccagagta tgcgggagac ccttccacct 2760 

caagatggat atttcttccc caaggatttc aagatgaatt gaaattttta atcaagatag 2820 

tgtgctttat tctgttgtat tttttattat tttaatatac tgtaagccaa actgaaataa 2880 

catttgctgt tttataggtt tgaagaacat aggaaaaact aagaggtttt gtttttattt 2940 

ttgctgatga agagatatgt ttaaatatgt tgtattgttt tgtttagtta caggacaata 3000 

atgaaatgga gtttatattt gttatttcta ttttgttata tttaataata gaattagatt 3060 

gaaataaaat ataatgggaa ataatctgca gaatgtgggt ttcctggtgt ttcctctgac 3120 

tctagtgcac tgatgatctc tgataaggct cagctgcttt atagttctct ggctaatgca 3180 

gcagatactc ttcctgccag tggtaatacg attttttaag aaggcagttt gtcaatttta 3240 

atcttgtgga tacctttata ctcttagggt attattttat acaaaagcct tgaggattgc 3300 

attctatttt ctatatgacc ctcttgatat ttaaaaaaca ctatggataa caattcttca 3360 

tttacctagt attatgaaag aatgaaggag ttcaaacaaa tgtgbttccc agttaactag 3420 

ggtttactgt ttgagccaat ataaatgttt aactgtttgt gatggcagta ttcctaaagt 3480 

acattgcatg ttttcctaaa tacagagttt aaataatttc agtaattctt agatgattca 3540 

gcttcatcat taagaatatc ttttgtttta tgttgagtta gaaatgcctt catatagaca 3600 

tagtctttca gacctctact gtcagttttc atttctagct gctttcaggg ttttatgaat 3660 

tttcaggcaa agctttaatt tatactaagc ttaggaagta tggctaatgc caacggcagt 3720 

ttttttcttc ttaattccac atgactgagg catatatgat ctctgggtag gtgagttgtt 3780 

gtgacaacca caagcacttt thtttttttt aaagaaaaaa aggtagtgaa tttttaatca 3840 

tctggacttt aagaaggatt ctggagtata cttaggcctg aaattatata tatttggctt 3900 

ggaaatgtgt ttttcttcaa ttacatctac aagtaagtac agctgaaatt cagaggaccc 3960 

ataagagttc acatgaaaaa aatcaattca tttgaaaagg caagatgcag gagagaggaa 4020 

gccttgcaaa cctgcagact gctttttgcc caatatagat tgggtaaggc tgcaaaacat 4080 

aagcttaatt agctcacatg ctctgctctc acgtggcacc agtggatagt gtgagagaat 4140 

taggctgtag aacaaatggc cttctctttc agcattcaca ccactacaaa atcatctttt 4200 

atatcaacag aagaataagc ataaactaag caaaaggtca ataagtacct gaaaccaaga 4260 

ttggctagag atatatctta atgcaatcca ttttctgatg gattgttacg agttggctat 4320 

ataatgtatg tatggtattt tgatttgtgt aaaagtttta aaaatcaagc tttaagtaca 4380 

tggacatttt taaataaaat atttaaagac aatttagaaa attgccttaa tatcattgtt 4440 

ggctaaatag aataggggac atgcatatta aggaaaaggt catggagaaa taatattggt 4500 

atcaaacaaa tacattgatt tgtcatgata cacattgaat ttgatccaat agtttaagga 4560 

ataggtagga aaatttggtt tctatttttc gatttcctgt aaatcagtga cataaataat 4620 

tcttagctta ttttatattt ccttgtctta aatactgagc tcagtaagtt gtgttagggg 4680 

attatttctc agttgagact ttcttatatg acattttact atgttttgac ttcctgacta 4740 

ttaaaaataa atagtagaaa caattttcat aaagtgaaga attatataat cactgcttta 4800 

taactgactt tattatattt atttcaaagt tcatttaaag gctactattc atcctctgtg 4860 

atggaatggt caggaatttg ttttctcata gtttaattcc aacaacaata ttagtcgtat 4920 

ccaaaataac ctttaatgch aaactttact gatgtatatc caaagcttct ccttttcaga 4980 

cagattaatc cagaagcagt cataaacaga agaataggtg gtatgttcct aatgatatta 5040 

tttctactaa tggaataaac tgtaatatta gaaattatgc tgctaattat atcagctctg 5100 

aggtaatttc tgaaatgttc agactcagtc ggaacaaatt ggaaaattta aatttttatt 5160 

cttagctata aagcaagaaa gtaaacacat taatttcctc aacattttta agccaattaa 5220 

aaatataaaa gatacacacc aatatcttct tcaggctctg acaggcctcc tggaaacttc 5280 

cacatatttt tcaactgcag tataaagtca gaaaataaag ttaacataac tttcactaac 5340 

acacacatat gtagatttca caaaatccac ctataattgg tcaaagtggt tgagaatata 5400 

ttttttagta attgcatgca aaatttttct agcttccatc ctttctccct cgtttcttct 5460 

ttttttgggg gagctggtaa ctgatgaaat cttttcccac cttttctctt caggaaatat 5520 

aagtggtttt gtttggttaa cgtgatacat tctgtatgaa tgaaacattg gagggaaaca 5580 

tctactgaat ttctgtaatt taaaatattt tgctgctagt taactatgaa cagatagaag 5640 

aatcttacag atgctgctat aaataagtag aaaatataaa tttcatcact aaaatatgct 5700 

attttaaaat ctatttccta tattgtattt ctaatcagat gtattactct tattatttct 5760 

attgtatgtg ttaatgattt tatgtaaaaa tgtaattgct tttcatgagt agtatgaata 5820 

aaattgatta gtttgtgttt tcttgtctcc cgaaaaaaaa aaaaaaaaaa aaaaaaaaaa 5880 

aa 5882 



<210> 31 
<211> 310 
<212> DNA 

<213> Homo sapiens 

<400> 31 

cggccccaga aaacccgagc gagtaggggg cggcgcgcag gagggaggag aactgggggc 60 

gcgggaggct ggtgggtgtc gggggtggag atgtagaaga tgtgacgccg cggcccggcg 120 

ggtgccagat tagcggacgg ctgcccgcgg ttgcaacggg atcccgggcg ctgcagcttg 180 

ggaggcggct ctccccaggc ggcgtccgcg gagacaccca tccgtgaacc ccaggtcccg 240 

ggccgccggc tcgccgcgca ccaggggccg gcggacagaa gagcggccga gcggctcgag 300 

gctgggggac 310 



<210> 32 

<211> 3212 

<212> DNA 

<213> Homo sapiens 

<400> 32 

tgagggcgcc aggcaggcgg gcgccaccgc cacccgcagc gagggcggag ccggccccag 60 

gtgctcccct gacagtccct cctctccgga gcattttgat accagaaggg aaagcttcat 120 

tctccttgtt gttggttgtt ttttcctttg ctctttcccc cttccatctc tgacttaagc 180 

aaaagaaaaa gattacccaa aaactgtctt taaaagagag agagagaaaa aaaaaatagt 240 

atttgcataa ccctgagcgg tgggggagga gggttgtgct acagatgata gaggatttta 300 

taccccaata atcaactcgt ttttatatta atgtacttgt ttctctgttg taagaatagg 360 

cattaacaca aaggaggcgt ctcgggagag gattaggttc catcctttac gtgtttaaaa 420 

aaaagcataa aaacatttta aaaacataga aaaattcagc aaaccatttt taaagtagaa 480 

gagggtttta ggtagaaaaa catattcttg tgcttttcct gataaagcac agctgtagtg 540 

gggttctagg catctctgta ctttgcttgc hcatatgcat gtagtcactt tataagtcat 600 

tgtatgttat tatattccgt aggtagatgt gtaacctctt caccttattc atggctgaag 660 

tcacctcttg gttacagtag cgtagcgtgg ccgtgtgcat gtcctttgcg cctgtgacca 720 

ccaccccaac aaaccatcca gtgacaaacc atccagtgga ggtttgtcgg gcaccagcca 780 

gcgtagcagg gtcgggaaag gccacctgtc ccactcctac gatacgctac tataaagaga 840 

agacgaaata gtgacataat atattctatt tttatactct tcctattttt gtagtgacct 900 
gfcttatgaga tgctggtttt ctacccaacg gccctgcagc cagctcacgt ccaggttcaa 960

cccacagcta cttggtttgt gttcttcttc atattctaaa accattccat ttccaagcac 1020



tttcagtcca ataggtgtag gaaatagcgc tgtttttgtt gtgtgtgcag ggagggcagt 1080 
tttctaatgg aatggtttgg gaatatccat gtacttgttt gcaagcagga ctttgaggca 1140 
agtgtgggcc actgtggtgg cagtggaggt ggggtgtttg ggaggctgcg tgccagtcaa 1200

gaagaaaaag gtttgcattc tcacattgcc aggatgataa gttcctttcc ttttctttaa 1260

agaagttgaa gtttaggaat cctttggtgc caactggtgt ttgaaagtag ggacctcaga 1320

ggtttaccta gagaacaggt ggtttttaag ggttatctta gatgtttcac accggaaggt 1380

ttttaaacac taaaatatat aatttatagt taaggctaaa aagtatattt attgcagagg 1440

atgttcataa ggccagtatg atttataaat gcaatctccc cttgatttaa acacacagat 1500

acacacacac acacacacac acacacaaac cttctgcctt tgatgttaca gatttaatac 1560 

agtttatttt taaagataga tccttttata ggtgagaaaa aaacaatctg gaagaaaaaa 1620 

accacacaaa gacattgatt cagcctgttt ggcgtttccc agagtcatct gattggacag 1680 

gcatgggtgc aaggaaaatt agggtactca acctaagttc ggttccgatg aattcttatc 1740 

cccbgcccct tcctttaaaa aacttagtga caaaatagac aatttgcaca tcttggctat 1800 

gtaattcttg taatttttat ttaggaagtg ttgaagggag gtggcaagag tgtggaggct 1860 

gacgtgtgag ggaggacagg cgggaggagg tgtgaggagg aggctcccga ggggaagggg 1920 

cggtgcccac accggggaca ggccgcagct ccattttctt attgcgctgc taccgttgac 1980 

ttccaggcac ggtttggaaa tattcacatc gcttctgtgt atctctttca cattgtttgc 2040 

tgctattgga ggatcagttt tttgttttac aatgtcatat actgccatgt actagtttta 2100 

gttttctctt agaacattgt attacagatg ccttttttgt agtttttttt ttttttatgt 2160

gatcaatttt gacttaatgt gattactgct ctattccaaa aaggttgctg tttcacaata 2220

Gctcatgctt cacttagcca tggtggaccc agcgggcagg ttctgcctgc tttggcgggc 2280

agacacgcgg gcgcgatccc acacaggctg gcgggggccg gccccgaggc cgcgtgcgtg 2340 

agaaccgcgc cggtgtcccc agagaccagg ctgtgtccct cttctcttcc ctgcgcctgt 2400 

gatgctgggc acttcatctg atcgggggcg tagcatcata gtagttttta cagctgtgtt 2460 

attctttgcg tgtagctatg gaagttgcat aattattatt attattatta taacaagtgt 2520 

gtcttacgtg ccaccacggc gttgtacctg taggactctc attcgggatg attggaatag 2580 

cttctggaat ttgttcaagt tttgggtatg tttaatctgt tatgtactag tgttctgttt 2640 

gttattgttt tgttaattac accataatgc taatttaaag agactccaaa tctcaatgaa 2700 

gccagctcac agtgctgtgt gccccggtca cctagcaagc tgccgaacca aaagaatttg 2760 

caccccgctg cgggcccacg tggttggggc cctgccctgg cagggtcabc ctgtgctcgg 2820 

aggccatctc gggcacaggc ccaccccgcc ccacccctcc agaacacggc tcacgcttac 2880

ctcaaccatc ctggctgcgg cgtctgtctg aaccacgcgg gggccttgag ggacgctttg 2940

tctgtcgtga tggggcaagg gcacaagtcc tggatgttgt gtgtatcgag aggccaaagg 3000

ctggtggcaa gtgcacgggg cacagcggag tctgtcctgt gacgcgcaag tctgagggtc 3060 

tgggcggcgg gcggctgggt ctgtgcattt ctggttgcac cgcggcgctt cccagcacca 3120 

acatgtaacc ggcatgtttc cagcagaaga caaaaagaca aacatgaaag tctagaaata 3180 

aaactggtaa aaccccaaaa aaaaaaaaaa aa 3212 



<210> 33 

<211> 1043 

<212> DNA

<213> Homo sapiens 



<220> 

<221> misc_feature 
<222> (409)..(444)
<223> n = a, t, g or c 

<400> 33 

gcaccgcggc gagcttggct gcttctgggg cctgtgtggc cctgtgtgtc ggaaagatgg 60 

agcaagaagc cgagcccgag gggcggccgc gacccctctg accgagatcc tgctgctttc 120 

gcagccagga gcaccgtccc tccccggatt agtgcgtacg agcgcccagt gccctggccc 180 

ggagagtgga atgatccccg aggcccaggg cgtcgtgctt ccgcgcgccc cgtgaaggaa 240 

actggggagt cttgagggac ccccgactcc aagcgcgaaa accccggatg gtgaggagca 300 

ggtactggcc cggcagcgag cggtcacttt tgggtctggg ctctgacggt gtcccctcta 360 

tcgctggttc ccagcctctg cccgttcgca gcctttgtgc ggttcgtgnc tgggggctcg 420 

gggcgcgggg cgcggggcat gggncacgtg gctttgcgga ggttttgttg gactggggct 480 

agacagtccc cgccagggag gagggcggga tttcggacgg ctctcgcggc ggtgggggtg 540 

ggggtggttc ggaggtctcc gcgggagttc agggtaaagg tcacggggcc ggggctgcgg 600 

gccgcttcgg cgcgggaggt ccggatgatc gcagtgcctg tcgggtcact agtgtgaacg 660 



ctgcgcgtag tctgggcggg attgggccgg ttcagtgggc aggttgactc agcttttcct 720

cttgagctgg tcaagttcag acacgttccg aaactgcagt aaaaggagtt aagtcctgac 780

ttgtctccag ctggggctat ttaaaccatg cattttccca gctgtgttca gtggcgattg 840 

gagggtagac ctgtgggcac ggacgcacgc cactttttct ctgctgatcc aggtaagcac 900 

cgacttgctt gtagctttag ttttaactgt tgtttatgtt ctttatatat gatgtatttt 960 

ccacagatgt ttcatgattt ccagttttca tcgtgtcttt tttttccttg taggcaaatg 1020 

tgcaatacca acatgtctgt acc 1043 

<210> 34

<211> 1153

<212> DNA

<213> Homo sapiens

<400> 34

tagttgacct gtctataaga gaattatata tttctaacta tataacccta ggaatttaga 60

caacctgaaa tttattcaca tatatcaaag tgagaaaatg cctcaattca catagatttc 120

ttctctttag tataattgac ctactttggt agtggaatag tgaatactta ctataatttg 180

acttgaatat gtagctcatc ctttacacca actcctaatt ttaaataatt tctactctgt 240

cttaaatgag aagtacttgg tttttttttt cttaaatatg tatatgacat ttaaatgtaa 300

cttattattt tttttgagac cgagtcttgc tctgttaccc aggctggagt gcagtgggtg 360

atcttggctc actgcaagct ctgccctccc cgggttcgca ccattctcct gcctcagcct 420

cccaattagc ttggcctaca gtcatctgcc accacacctg gctaattttt tgtactttta 480

gtagagacag ggtttcaccg tgttagccag gatggtctcg atctcctgac ctcgtgatcc 540

gcccacctcg gcctcccaaa gtgctgggat tacaggcatg agccaccgtg ctctccagcc 600 

taggcaacag agtgagactc tgtctccaaa aaaaaaaaaa aaaaaagggg actataacac 660 

ccccagggaa agggacaggt gggacattct tattcttaat ttaaataaat tgacagggga 720 

aagttgggcc actcttgagc ttgtgggtgc tcaccaggtt gaccccaaaa aaagaagcct 780 

tccacaaaac attaatttat ttccctaata tacccgcctc tgtgagttaa gggataatgc 840 

atcaggactc ttgcaaccag acaaaattat ttaaaaacgc cacttggggg ggaggcgggt 900 

ccctcctggg gattcgcctt tgtgggagag aaaactgcac agacttgggc aaataatgtt 960 

ttttgtcacc ccaaaacgta ttcgcgagac atttcattag aacgaagctt taccctaata 1020 

ttgaactccc catttaaaca gtttccacac acacttaggg agatttttcc ctctgtgagt 1080 

tccgcagaac aatagttgga cgggaataga accctgaaac actttagttc accacgaact 1140

attatagggc ggg 1153 

<210> 35 

<211> 334 

<212> DNA 

<213> Homo sapiens 

<400> 35 

tgactatcca gctctgagag acgggagttt ggagttgccc gctttacttt ggttgggttg 60 

gggg999c:gg cgggctgttt tgttcctttt cttttttaag agttgggttt tcttttttaa 120 

ttatccaaac agtgggcagc ttcctccccc acacccaagt atttgcacaa tatttgtgcg 180

gggtatgggg gtgggttttt aaatctcgtt tctcttggac aagcacaggg atctcgttct 240

cctcattttt tgggggtgtg tggggacttc tcaggtcgtg tccccagcct tctctgcagt 300
cccttctgcc ctgccgggcc cgtcgggagg cgcc 334 

<210> 36 

<211> 543 

<212> DNA

<213> Homo sapiens 

<400> 36 

tagctcagga ccttggctgg gcctggtcgt catgtaggtc aggaccttgg ctggacctgg 60 

aggccctgcc cagccctgct ctgcccagcc cagcaggggc tccaggcctt ggctggcccc 120

acatcgcctt ttcctccccg acacctccgt gcacttgtgt ccgaggagcg aggagcccct 180

cgggccctgg gtggcctctg ggccctttct cctgtctccg ccactccctc tggcggcgct 240

ggccgtggct ctgtctctct gaggtgggtc gggcgccctc tgcccgcccc ctcccacacc 300

agccaggctg gtctcctcta gcctgtttgt tgtggggtgg gggtatattt tgtaaccact 360

gggcccccag cccctctttt gcgacccctt gtcctgacct gttctcggca ccttaaatta 420 

ttagaccccg gggcagfccag gtgctccgga cacccgaagg caataaaaca ggagccgtga 480 

aaaaaaaaaa aaaaaaaaaa aaaaaaaaaa aaaaaaaaaa aaaaaaaaaa aaaaaaaaaa 540 

aaa 543 

<210> 37 

<211> 511 

<212> DNA 

<213> Homo sapiens 

<400> 37 

gctcagcaag gggtccgtcc ttctctgtca ctgtctcttt tgcctgttgt aattctgtct 60 

gcctctctgg gactctgcct gtctcactct ttctgtctgt gcctctcctc actcttgttc 120 

tttctgcctg aatcacagcc ctcagttttt ctgtcctcat gcatttgtct ttgtggctct 180 
ttccgtcttt ctgcccttga caccatcccc tctcccagtg cttcccctct gcttccagat 240

cgcttcatga cttaggcagg gaaacagagg tcagggcctc cttccaggct tccctctgca 300

tcttactgag tatgcaggtc ggaagagcct cgggtcctgc ctccgcgggt ggcctagagc 360

caaaggaagg cggagcccgt cggggcggga ttggccctta gggccacctc ataaagcctg 420

gggcgagggg cacaacggcc ttgggaagga gccctgctgg ggccgtccag tcccccagac 480

ctcacaggct cagtcgcgga tctgcagtgt c 511

<210> 38

<211> 458

<212> DNA

<213> Homo sapiens

<400> 38

tagtagggac cagtgaccat cacatccctt caagagtcct gaagatcaag ccagttctcc 60 

ttccctgcag agctttggcc attaccacct gacctcttgc tgccagctaa taagaagtgc 120

caagtggaca gtctggccac tgtcaaggca gggaaggggc catgactttt ctgccctgcc 180 

ctcagcctgt tgccctgcct cccaaacccc attagtctag ccttgtagct gttactgcaa 240 

gtgtttcttc tggcttagtc tgttttctaa agccaggact attccctttc ctccccagga 300 

atatgtgttt tcctttgtct taatcgatct ggtaggggag aaatggcgaa tgtcatacac 360 

atgagatggt atatccttgc gatgtacaga atcagaaggt ggtttgacag catcataaac 420

aggctgactg gcaggaatga aaaaaaaaaa aaaaaaaa 458 

<210> 39 

<211> 270 

<212> DNA

<213> Homo sapiens 

<400> 39 

ggggccgccg agagccgcag cgccgctcgc ccgccgcccc ccaccccgcc gccccgcccg 60 

gcgaattgcg ccccgcgccc tcccctcgcg cccccgagac aaagaggaga gaaagtttgc 120 

gcggccgagc gggcaggtga ggagggtgag ccgcgcggag gggcccgcct cggccccggc 180 

tcagcccccg cccgcgcccc cagcccgccg ccgcgagcag cgcccggacc ccccagcggc 240 

ggccccgccc gcccagcccc ccggcccgcc 270 



<210> 40 

<211> 751 

<212> DNA 

<213> Homo sapiens 



<220> 

<221> misc_feature 

<222> (535)..(739)

<223> n = a, t, g or c 

<400> 40 

taagcaggcc tccaacgccc ctgtggccaa ctgcaaaaaa agcctccaag ggtttcgact 60 

ggtccagctc tgacatccct tcctggaaac agcatgaata aaacactcat cccatgggtc 120 

caaattaata tgattctgct ccccccttct ccttttagac atgghtgtgg gtctggaggg 180 

agacgtgggt ccaaggtcct catcccatcc tccctctgcc aggcactatg tgtctggggc 240 

ttcgatcctt gggtgcaggc agggctggga cacgcggctt ccctcccagt ccctgccttg 300

gcaccgtcac agatgccaag caggcagcac ttagggatct cccagctggg ttagggcagg 360 

gcctggaaat gtgcattttg cagaaacttt tgagggtcgt tgcaagactg tgtagcaggc 420 
ctaccaggtc ccttfccatct tgagagggac atggcccctt gttttctgca gcttccacgc 480

ctctgcactc cctgcccctg gcaagtgctc ccatcgcccc cggtgcccac catgnagctc 540 

cccgcacctg actcccccca catccaaggg cagccctgga accagtgggc tagttccttg 600 

aaggaagccc cactcattcc tattaatccc tcagaattcc cggggggagc cttccctcct 660 

gaaccttggt aaaaaatggg gaacgagaaa aacccccgct tggagctgtg cgtttccagc 720 

ccctacttga gagncttttt tttgggggcc g 751 

<210> 41 

<211> 229 

<212> DNA 

<213> Homo sapiens 

<400> 41 

cgcgccgggc ccggctcggc ccgacccggc tccgcgcggg caggcggggc ccagcgcact 60 

cggagcccga gcccgagccg cagccgccgc ctggggcgct tgggtcggcc tcgaggacac 120 

cggagagggg cgccacgccg ccgtggccgc agatttgaaa gaagccgaca ctaaaccacc 180

aatatacaac aaggccattt tgtcaaacga gagtcagcct ttaacgaaa 229 

<210> 42 

<211> 233 

<212> DNA 

<213> Homo sapiens 

<400> 42 

tagcagagag tcctgagcca ctgccaacat ttcccttctt ccagttgcac tattctgagg 60 

gaaaatctga cacctaagaa atttactgtg aaaaagcatt ttaaaaagaa aaggttttag 120 

aatatgatct attttatgca tattgtttat aaagacacat ttacaattta cttttaatat 180 

taaaaattac catattatga aaaaaaaaaa aaaaaaaaaa aaaaaaaaaa aaa 233 

<210> 43 

<211> 349 

<212> DNA 

<213> Homo sapiens 

<400> 43 

ggcacgaggg gcgagaggaa gcagggagga gagtgatttg agtagaaaag aaacacagca 60

ttccaggctg gccccacctc tatattgata agtagccaat gggagcgggt agccctgatc 120 

cctggccaat ggaaactgag gtaggcgggt catcgcgctg gggtctgtag tctgagcgct 180 

acccggttgc tgctgcccaa ggaccgcgga gtcggacgca ggcagaccat gtggaccctg 240 

gtgagctggg tggccttaac agcagggctg gtggctggaa cgcggtgccc agatggtcag 300

ttctgccctg tggcctgctg cctggacccc ggaggagcca gctacagct 349 
<210> 44

<211> 337 

<212> DNA 

<213> Homo sapiens 



<400> 44

tgagggacag tactgaagac tctgcagccc tcgggaccGc actcggaggg tgccctctgc 60

tcaggcctcc ctagcacctc cccctaacca aattctccct ggaccccatt ctgagctccc 120

catcaccatg ggaggtgggg cctcaatcta aggccttccc tgtcagaagg gggttgtggc 180

aaaagccaca ttacaagctg ccatcccctc cccgtttcag tggaccctgt ggccaggtgc 240

ttttccctat ccacaggggt gtttgtgtgt gtgcgcgtgt gcgtttcaat aaagtttgta 300

cactttcaaa aaaaaaaaaa aaaaaaaaaa aaaaaaa 337


<210> 45 










<211> 1700 








<212> DNA 










<213> Homo sapiens 








<400> 45

tgtttgcatt aagttcatag attataattt gtaatggaat caacaccaaa tgcaaattag 60

aaagagagcc cactttgctc acccagtcac gtcttcccat gtaaccatag aacgttgggg 120

tcctgtgtct ttctagatcc acagtcttgc tctcagaaca ggctagccac accacaggcc 180

tagtgccagg acccatggcc tttttttaag ctcagactcc cttctgtgaa cagcaatatc 240

cccacaactt gtacaacatt ggtgcttcct gcaagggcta cagaactatt tgatacgaaa 300

atgttcattg acttacacac aagagaagca caaaataaaa aattaataat taatttaatg 360

tctttgaaaa tgtaccattt atttttacat ttggggtcat aagaattgta ttacacttaa 420

gaatgcaata caatttgaag atcagatttt tctccctttg tgagaatttc tcagtatgtg 480

tgatgactac caagaaatca tagccagtca taaattcagt gagttactca taaacgaaca 540

agaaccacct acttcttggg gaggtaggtc tgcttccctt caactcagga tacaactgct 600

ttcaactgct ttcttcacat tagctgacta attagctaga agcctgtcgt aaacaatttt 660

atggttgact ccttccctgg gctcagggtt ccctagaaca gagaggtccc caaatcccgg 720

tctgtggcct gtccgcctaa gctctgcctc ctgccagatc agcaggcagc attagattct 780

cataggagct ggacgcctat tgtgaactgc gcatgtgcgg gatccagatt gtgcactctt 840

tatgagaatc taactaatgc ttgatgatct atctgaacca gaacaatttc atcctgaaac 900

catcccccac caatccatag aaatactgtc ttccacaaaa atgatccctg gtgccaaaaa 960

tgttagagac cactccccta aaactctctt cttagctctc acctcctgta ttactatctc 1020

atctcagtac attgaagccc ccatcttttc cccatggatg cctcatttcc tattagggag 1080

gcattttttt attttttgtt tttatttttt tccgagacgg agtctcgctc tgtcgccaag 1140

gctggagtgc agtggcgcga tctcggctca ctgcaagctc cgcctcccgg gttcacgcca 1200

ttctcctgcc tcagcctccc aagtagctgg gactacaggc gcccgcacta cgcccggcta 1260

attttttgta tttttagtag agacggggtt tcaccgtggt agccaggatg gtchcgatct 1320


CC ^ 


uy cl U O 


gccttggcct 


cccaaagtgc 


tgggattaca 


qqcqtqaqac 


1380 


i~w r~* t~i f~i t~* i~*f^rti~* 

eg cccygc 


oy \^ (.< a. u L. uy y 




tgtgcctcag 


gacctagcac 


agtccctggt 


1440 




rra r" 1~ a t" crt" s 

M CL^^ LiCL L>a 


atgttcgtta 


ttcaataata 


aatacatgaa 


ttaaagagtg 


1500 


agagtggatt ttgtaatgtt acgactgata gagaaatact cagtgattct aagggatggg 1560

gaagaacggt tggagctaga ggttgtgctc aggaaactat taaatagacg ttccgcagga 1620

agggattgac gaagtgtgag gttaatgagg aagggaaaat agaatataaa atttggtggt 1680

ggaaaagatc tgattcatga 1700



<210> 46 

<211> 2419 

<212> DNA 

<213> Homo sapiens 

<400> 46

taaccagcgg gcccctggtc aagtgctggc tctgctgtcc ttgccttcca tttcccctct 60

gcacccagaa cagtggtggc aacattcatt gccaagggcc caaagaaaga gctacctgga 120

ccttttgttt tctgtttgac aacatgttta ataaataaaa atgtcttgat atcagtaaga 180

atcagagtct tctcactgat tctgggcata ttgatctttc ccccattttc tctacttggc 240

tgctccctga gaggactgca taggatagaa atgccttttt cttttctttt cgtttttttt 300

tttttttttt tttgagatgg agtctcactc tgtcgcccag gcttaagtgc aatggcacaa 360

tctcggctca ctgcaacctc tctctcctgg gttcaagtga ttctcctgcc tcagcctccc 420

aaatagctga gattacaggc atgcaccacc acacctggct aatttttgtg tttttagtag 480

agacagggtt tcaccgtttt ggccaggttg gtcttgaact cctgacctcg ggagatccgc 540

ccaccttggc ctctctttgt gctgggatta caggcatgag ccactgagcc gggccacttt 600

ttccttatca gtcagttttt acaagtcatt agggaggtag actttacctc tctgtgaagg 660

aaagtatggt atgttgatct acagagagag atggaaaaat tccagggctc gtagctacta 720

agcagaattt ccaagatagg caaattgttt tttctgtcaa ataataagct aatattactt 780

ctacaaatat gagaccttgg agagaagttt ccaaggacca agtaccaaca taccaacaga 840
ttattatagt




cttacacaca 


cacacacaca 


tatacacata 


tgtaatccag 


900 




aactct 1— L. ^ d u L. 


cagggtagcc accttttgtc ttaatcgaga gataattttg 


960 


atgtttgaat 




caggatattc tcttgtcatg gttattttat 


ataaaattca 


1020 


aaaaccaatt 


aOd U L.CL L. U ULf 


ctctgtaatc 


ttttacttta 


tcaactaatg 


tctggcaagt 


1080 


gtgatgtttt ggggaagtta tagaagattc cggccaggcg cttatctcac gcttgtaatc 1140

cagcactttg ggaagctgag gcggacagat cacgaggtca agagatcaag accatcctgg 1200

acaacatggt gaaaccttgt ctctactaaa aatgtgaaaa ttagctgggc gtggtggcac 1260

acacctatag tcccagctac tcgggaggct gaggcaggag aatcgcttga acctaggagg 1320

cggaggttgc actgagccga gatcacgcca ctgcactcca gcctgggcga cagagcgaga 1380


c c a. t. c u c o. 


aaaaaaaaaa 
a. a d cL d cL d cL cLd 


aaaaagaaag 


atcccagtztt 


atcccagttt 


atcccttatt 


1440 




^ t" r< a a na +* 1* 


iiguucccaag 


ttaacataac 


ttaggttaac 


acactctttg 


1500 


1 3.3. 3. S-IZ 3. C 3. C 


4" "H a a 1" ^ 

U.9 I" L.L.dd L. 


acagactcag 


tggttagctt 


cctgttaact 


aatttctgtt 


1560 






atttagaaag 


tggttgccaa 


taaattagtt 


ataagtcgcc 


1620 






acataattat 


tgtggtctca 


gtattcccta 


tggtggcttc 


1680 


tcct^ctcct 




tgaaatgggc 


caaaagccgt 


ggctccccaa 


tgctcaggtt 


1740 




c C9.99t 3.C 


cacctaggag 


agcccagcct 


cactgaaagt 


attcaaattt 


1800 


aggaatgyy t 


L. L.^ dy ddy L. d 


ggtagctggt 


atgtgcttag 


cacaagaatc 


tctcttcctt 


1860 




y u u u^dddd^ 


tgaaaacact 


gtcattcctt 


aagaaaatag gaaaaagtat 


1920 


UCC3a.a.CC UC 


^ /^a ^"f" a r*f a 
L.9 L-^cLv.. L-dy d 


aaatttgcca 


tattaccaaa 


tctcaaaaac 


ctctcaggaa 


1980 


3.t9^9^^^^?^ 


/-I /~i /*t ^ ^t4~ t" t" ^ ^ 
C L> ^ dy U L. U ^ U 


ggtaaactat 


ttgggccctt 


ttctcaagtt 


ctccttccag 


2040 


L>Mu L_CLldLd L. 


L L^y dyy uy dy 


gcaaagttac 


tcaagatcat 


cgctgccact 


caaggccttg 


2100 


CL U> CL ^ ^ ^ ^ CL 


i" rfa a a nctr*s t" 
L>y dddyy ^d l> 


ggaccattat 


tatattgatc 


acagcataag 


ctgtgaaaac 


2160 


^L>a^cL L.^ L. U W 


L>^^ddd^d uv.* 


tgcttggagc 


attatcatcg 


catagtttgc 


tctggtgttc 


2220 


agggaaatcg ctgtttcata ggaaatcaca tggcagtggg atgggagtgt ttcctgacct 2280

gccgatggta ctggcacctg agcaagcatt cctagtcctt tttggtctgg gcctcttgtt 2340

ctatcacaac cacaagctgt ttaaaataaa aacgtcaagt cacaggcagg tcattttatc 2400

ctgcgtgaat caattgaag 2419



<210> 47 

<211> 297 

<212> DNA 

<213> Homo sapiens 
<400> 47

tcctcagtgc acagtgctgc ctcgtctgag gggacaggag gatcaccctc ttcgtcgctt 60 

cggccagtgt gtcgggctgg gccctgacaa gccacctgag gagaggctcg gagccgggcc 120 

cggaccccgg cgattgccgc ccgcttctct ctagtctcac gaggggtttc ccgcctcgca 180 

cccccacctc tggacttgcc tttccttctc ttctccgcgt gtggagggag ccagcgctta 240 

ggccggagcg agcctggggg ccgcccgccg tgaagacatc gcggggaccg attcacc 297

<210> 48 

<211> 1192 

<212> DNA 

<213> Homo sapiens 

<400> 48 

tgagcttttt cttaatttca ttcctttttt tggacactgg tggctcacta cctaaagcag 60 

tctatttata ttttctacat ctaattttag aagcctggct acaatactgc acaaacttgg 120

ttagttcaat ttttgatccc ctttctactt aatttacatt aatgctcttt tttagtatgt 180

tctttaatgc tggatcacag acagctcatt ttctcagttt tttggtattt aaaccattgc 240

attgcagtag catcatttta aaaaatgcac ctttttattt atttattttt ggctagggag 300 

tttatccctt tttcgaatta tttttaagaa gatgccaata taatttttgt aagaaggcag 360 

taacctttca tcatgatcat aggcagttga aaaattttta cacctttttt ttcacatttt 420 

acataaataa taatgctttg ccagcagtac gtggtagcca caattgcaca atatattttc 480 

ttaaaaaata ccagcagtta ctcatggaat atattctgcg tttataaaac tagtttttaa 540 

gaagaaattt tttttggcct atgaaattgt taaacctgga acatgacatt gttaatcata 600 

taataatgat tcttaaatgc tgtatggttt attatttaaa tgggtaaagc catttacata 660 

atatagaaag atatgcatat atctagaagg tatgtggcat ttatttggat aaaattctca 720 

ahtcagagaa atcatctgat gtttctatag tcactttgcc agctcaaaag aaaacaatac 780 
cctatgtagt tgtggaagtt tatgctaata ttgtgtaact gatattaaac ctaaatgttc 840

tgcctaccct gttggtataa agatattttg agcagactgt aaacaagaaa aaaaaaatca 900

tgcattctta gcaaaattgc ctagtatgtt aatttgctca aaatacaatg tttgatttta 960

tgcactttgt cgctattaac atcctttttt tcatgtagat ttcaataatt gagtaatttt 1020

agaagcatta ttttaggaat atatagttgt cacagtaaat atcttgtttt ttctatgtac 1080

attgtacaaa tttttcattc cttttgctct ttgtggttgg atctaacact aactgtattg 1140

ttttgttaca tcaaataaac atcttctgtg gaccaggaaa aaaaaaaaaa aa 1192

<210> 49

<211> 197

<212> DNA

<213> Homo sapiens 



<400> 49

agacagcctt aacccacggg cgcgggcgag tcgtatgggc aggggcaggc gggagcgacg 60

tggggcgacg ctcacgaacg atcagagctg cgggcgacgc aacgaagccc ggaggccgca 120

ggctgcgcgc tccctcgcag cagccgggcg ggcaaaagcc cccagtcctc ggcccccgcg 180

caagcgacgc cgggaaa 197


<210> 50 

<211> 3293 

<212> DNA 

<213> Homo sapiens 












<400> 50

taattattta tattgtaaag aattttaaca gtcctgggga cttccttgaa ggatcatttt 60

cacttttgct cagaagaaag ctctggatct atcaaataaa gaagtccttc gtgtgggcta 120

catatataga tgttttcatg aagaggagtg aaaagccaga aggatataga caaatgaggc 180

ctaagacctt tcctgccagt aactatactg tcagtagccg gcaaatgtta caagaaattc 240

gggaatccct taggaattta tctaaaccat ctgatgctgc taaggctgag cataacatga 300

gtaaaatgtc aaccgaagat cctcgacaag tcagaaatcc acccaaattt gggacgcatc 360

ataaagcctt gcaggaaatt cgaaactctc tgcttccatt tgcaaatgaa acaaattctt 420

ctcggagtac ttcagaagtt aatccacaaa tgcttcaaga cttgcaagct gctggatttg 480

atgaggatat ggttatacaa gctcttcaga aaactaacaa cagaagtata gaagcagcaa 540

ttgaattcat tagtaaaatg agttaccaag atcctcgacg agagcagatg gctgcagcag 600

ctgccagacc tattaatgcc agcatgaaac cagggaatgt gcagcaatca gttaaccgca 660

aacagagctg gaaaggttct aaagaatcct tagttcctca gaggcatggc ccgccactag 720

gagaaagtgt ggcctatcat tctgagagtc ccaactcaca gacagatgta ggaagacctt 780

tgtctggatc tggtatatca gcatttgttc aagctcaccc tagcaacgga cagagagtga 840

accccccacc accacctcaa gtaaggagtg ttactcctcc accacctcca agaggccaga 900

ctccccctcc aagaggtaca actccacctc ccccttcatg ggaaccaaac tctcaaacaa 960

agcgctattc tggaaacatg gaatacgtaa tctcccgaat ctctcctgtc ccacctgggg 1020

catggcaaga gggctatcct ccaccacctc tcaacacttc ccccatgaat cctcctaatc 1080

aaggacagag aggcattagt tctgttcctg ttggcagaca accaatcatc atgcagagtt 1140

ctagcaaatt taactttcca tcagggagac ctggaatgca gaatggtact ggacaaactg 1200

atttcatgat acaccaaaat gttgtccctg ctggcactgt gaatcggcag ccaccacctc 1260

catatcctct gacagcagct aatggacaaa gcccttctgc tttacaaaca gggggatctg 1320

ctgctccttc gtcatataca aatggaagta ttcctcagtc tatgatggtg ccaaacagaa 1380

atagtcataa catggaacta tataacatta gtgtacctgg actgcaaaca aattggcctc 1440

agtcatcttc tgctccagcc cagtcatccG cgagcagtgg gcatgaaatc cctacatggc 1500

aacctaacat accagtgagg tcaaattctt ttaataaccc attaggaaat agagcaagtc 1560

actctgctaa ttctcagcct tctgctacaa cagtcactgc aattacacca gctcctattc 1620

aacagcctgt gaaaagtatg cgtgtattaa aaccagagct acagactgct ttagcaccta 1680

cacacccttc ttggatacca cagccaattc aaactgttca acccagtcct tttcctgagg 1740

gaaccgcttc aaatgtgact gtgatgccac ctgttgctga agctccaaac tatcaaggac 1800

caccaccacc ctacccaaaa catctgctgc accaaaaccc atctgttcct ccatacgagt 1860

caatcagtaa gcctagcaaa gaggatcagc caagcttgcc caaggaagat gagagtgaaa 1920

agagttatga aaatgttgat agtggggata aagaaaagaa acagattaca acttcaccta 1980

ttactgttag gaaaaacaag aaagatgaag agcgaaggga atctcgtatt caaagttatt 2040

ctcctcaagc atttaaattc tttatggagc aacatgtaga aaatgtactc aaatctcatc 2100

agcagcgtct acatcgtaaa aaacaattag agaatgaaat gatgcgggtt ggattatctc 2160

aagatgccca ggatcaaatg agaaagatgc tttgccaaaa agaatctaat tacatccgtc 2220

ttaaaagggc taaaatggac aagtctatgt ttgtgaagat aaagacacta ggaataggag 2280

catttggtga agtctgtcta gcaagaaaag tagatactaa ggctttgtat gcaacaaaaa 2340

ctcttcgaaa gaaagatgtt cttcttcgaa atcaagtcgc tcatgttaag gctgagagag 2400

atatcctggc tgaagctgac aatgaatggg tagttcgtct atattattca ttccaagata 2460

aggacaattt atactttgta atggactaca ttcctggggg tgatatgatg agcctattaa 2520

ttagaatggg catctttcca gaaagtctgg cacgattcta catagcagaa cttacctgtg 2580

cagttgaaag tgttcataaa atgggtttta ttcatagaga tattaaacct gataatattt 2640

tgattgatcg tgatggtcat attaaattga ctgactttgg cctctgcact ggcttcagat 2700

ggacacacga ttctaagtac tatcagagtg gtgaccatcc acggcaagat agcatggatt 2760

tcagtaatga atggggggat ccctcaagct gtcgatgtgg agacagactg aagccattag 2820

agcggagagc tgcacgccag caccagcgat gtctagcaca ttctttggtt gggactccca 2880

attatattgc acctgaagtg ttgctacgaa caggatacac acagttgtgt gattggtgga 2940

gtgttggtgt tattcttttt gaaatgttgg tgggacaacc tcctttcttg gcacaaacac 3000

cattagaaac acaaatgaag gtcacctgct gctatataca tcattggctc gagaagaaac 3060

tactgaacac cctgcgagag agaagcctag aaaagaaaga aagggccaaa aggttttgaa 3120

ctcttcatcc ctaatttgct acactgatca aaaccaagta agggctcctg aagtccatga 3180 

gtctatcatc aatcagcaca aatgctatac tagtttgtaa ctgcggggtc agttgtgaag 3240 

gggaaggaca gcagtcttat ccatattcca ggaagccaca gtaaactgct cga 3293

<210> 51 

<211> 424 

<212> DNA 

<213> Homo sapiens 

<400> 51 

cctactctat tcagatattc tccagattcc taaagattag agatcatttc tcattctcct 60 

aggagtactc acttcaggaa gcaaccagat aaaagagagg tgcaacggaa gccagaacat 120 

tcctcctgga aattcaacct gtttcgcagt ttctcgagga atcagcattc agtcaatccg 180 

ggccgggagc agtcatctgt ggtgaggctg attggctggg caggaacagc gccggggcgt 240 

gggctgagca cagcgcttcg ctctctttgc cacaggaagc ctgagctcat tcgagtagcg 300 

gctcttccaa gctcaaagaa gcagaggccg ctgttcgttt cctttaggtc tttccactaa 360 

agtcggagta tcttcttcca agatttcacg tcttggtggc cgttccaagg agcgcgaggt 420 

cggg 424 



<210> 52 

<211> 706 

<212> DNA 

<213> Homo sapiens 



<400> 52

tgaactctga ctgtatgaga tgttaaatac tttttaatat ttgtttagat atgacattta 60

ttcaaagtta aaagcaaaca cttacagaat tatgaagagg tatctgttta acatttcctc 120

agtcaagttc agagtcttca gagacttcgt aattaaagga acagagtgag agacatcatc 180

aagtggagag aaatcatagt ttaaactgca ttataaattt tataacagaa ttaaagtaga 240

ttttaaaaga taaaatgtgt aattttgttt atattttccc atttggactg taactgactg 300

ccttgctaaa agattataga agtagcaaaa agtattgaaa tgtttgcata aagtgtctat 360

aataaaacta aactttcatg tgactggagt catcttgtcc aaactgcctg tgaatatatc 420

ttctctcaat tggaatattg tagataactt ctgcbttaaa aaagttttct ttaaatatac 480

ctactcattt ttgtgggaat ggttaagcag tttaaataat tcctgtgtat atgtctatca 540

cataggggtc taacagaaca atctggattc attatttcta ggacttgatc ctgctgatgc 600

tgaatttgca cattaaggtg tgttaacaac caaaacacag atcgatataa gaagtaagga 660

ggtggggaga ggcaaattat gatgtgctat gagttagatg tatagt 706



<210> 53 

<211> 239 

<212> DNA 

<213> Homo sapiens 

<400> 53 

agtccgcggc gttccccggc tgcagccggg agggggccga ggagtgactg agccccgggc 60 

tgtgcagtcc gacgccgact gaggcacgag cgggtgacgc tgggcctgca gcgcggagca 120

gaaagcagaa cccgcagagt cctccctgct gctgtgtgga cgacacgtgg gcacaggcag 180

aagtgggccc tgtgaccagc tgcactggtt tcgtggaagg aagctccagg actggcggg 239

<210> 54 

<211> 641 

<212> DNA 

<213> Homo sapiens 

<400> 54 

tgaggcagct gctatcccca tctccctgcc tggcccccaa cctcagggct cccaggggtc 60 

tccctggctc cctcctccag gcctgcctcc cacttcactg cgaagaccct cttgcccacc 120 

ctgactgaaa gtagggggct ttctggggcc tagcgatctc tcctggccta tccgctgcca 180 

gccttgagcc ctggctgttc tgtggttcct ctgctcaccg cccatcaggg ttctcttatc 240 

aactcagaga aaaatgctcc ccacagcgtc cctggcgcag gtgggctgga cttctacctg 300

ccctcaaggg tgtgtatatt gtataggggc aactgtatga aaaattgggg aggagggggc 360 

cgggcgcggt gctcacgcct gtaatcccag cactttggga ggccgaggcg ggtggatcac 420 

gaggtcagga gatcgagacc atcctggcta acatggtgaa accccgtctc tactaaaaat 480 

acaaaaaaaa tttagccggg cgcggtggcg ggcacctgta gtcccagcta cttgggaggc 540 

tgaggcagga gaatggtgtg aacccgggag cggaggttgc agtgagctga gatcgtgcta 600 

ctgcactcca gcctggggga cagaaagaga ctccgtctca a 641 

<210> 55 

<211> 493 

<212> DNA 

<213> Homo sapiens 

<400> 55 

tttctgtgaa gcagaagtct gggaatcgat ctggaaatcc tcctaatttt tactccctct 60 

ccccccgact cctgattcat tgggaagttt caaatcagct ataactggag agagctgaag 120 

attgatggga tcgttgcctt atgcctttgt tttggtttta caaaaaggaa acttgacaga 180 

ggatcatgct atacttaaaa aatacaacat cgcagaggaa gtagactcat attaaaaata 240 
cttactaata ataacgtgcc tcatgaagta aagatccgaa aggaattgga ataaaacttt 300

cctgcatctc aagccaaggg ggaaacacca gaatcaagtg ttccgcgtga ttgaagacac 360

cccctcgtcc aagaatgcaa agcacatcca ataaaagagc tggattataa ctcctcttct 420

ttctctgggg gccgtggggt gggagctggg gcgagaggtg ccgttggccc ccgttgcttt 480

tcctctggga ggg 493


<210> 56 

<211> 5282 

<212> DNA 

<213> Homo sapiens 












<400> 56 
tgaagtcaac 


atgcctgccc 


caaacaaata 


tgcaaaaggt 


tcactaaagc 


agtagaaata 


60 


atatgcattg 


tcagtgatgt 


tccatgaaac 


aaagctgcag 


gctgtttaag 


aaaaaataac 


120 


acacatataa 


acatcacaca 


cacagacaga 


cacacacaca 


cacaacaatt 


aacagtcttc 


180 


aggcaaaacg 


tcgaatcagc 


tatttactgc 


caaagggaaa 


fcatcatttat 


tttttacatt 


240 


attaagaaaa 


aaagatttat 


ttatttaaga 


cagtcccatc 


aaaactcctg 


tctttggaaa 


300 


tccgaccact 


aattgccaag 


caccgcttcg 


tgtggctcca 


cctggatgtt 


ctgtgcctgt 


360 


aaacatagat 


tcgctttcca 


tgttgttggc 


cggatcacca 


tctgaagagc 


agacggatgg 


420 


aaaaaggacc 


tgatcattgg ggaagctggc 


tttctggctg 


ctggaggctg 


gggagaaggt 


480 


gttcattcac 


ttgcatttct ttgcGctggg ggctgtgata ttaacagagg gagggttcct 


540 


gtggggggaa 


gtccatgcct 


ccctggcctg 


aagaagagac 


tctttgcata 


tgactcacat 


600 


gatgcatacc 


tggtgggagg aaaagagttg ggaacttcag atggacctag tacccactga 


660 


gatttccacg 


ccgaaggaca 


gcgatgggaa 


aaatgccctt 


aaatcatagg 


aaagtatttt 


720 


tttaagctac 


caattgtgcc 


gagaaaagca 


ttttagcaat 


ttatacaata 


tcatccagta 


780 


ccttaagccc 


tgattgtgta 


tattcatata 


ttttggatac 


gcacccccca 


actcccaata 


840 


ctggctctgt 


ctgagtaaga 


aacagaatcc 


tctggaactt 


gaggaagtga 


acatttcggt 


900 


gacttccgca 


tcaggaaggc 


tagagttacc 


cagagcatca 


ggccgccaca 


agtgcctgct 


950 


tttaggagac 


cgaagtccgc 


agaacctgcc 


tgtgtcccag 


cttggaggcc 


tggtcctgga 


1020 


actgagccgg 


ggccctcact 


ggcctcctcc 


agggatgatc 


aacagggcag 


tgtggtctcc 


1080 


gaatgtctgg 


aagctgatgg 


agctcagaat 


tccactgtca 


agaaagagca gtagaggggt 


1140 


gtggctgggc 


ctgtcaccct 


ggggccctcc 


aggtaggccc 


gttttcacgt 


ggagcatggg 


1200 


agccacgacc 


cttcttaaga 


catgtatcac 


tgtagaggga 


aggaacagag 


gccctgggcc 


1260 


cttcctatca 


gaaggacatg gtgaaggctg ggaacgtgag gagaggcaat 


ggccacggcc 


1320 


cattttggct 


gtagcacatg gcacgttggc tgtgtggcct tggcccacct gtgagtttaa 


1380 


agcaaggctt 


taaatgactt 


tggagagggt 


cacaaatcct 


aaaagaagca 


ttgaagtgag 


1440 


gtgtcatgga 


ttaattgacc 


cctgtctatg 


gaattacatg 


taaaacatta 


tcttgtcact 


1500 


gtagtttggt 


tttatttgaa 


aacctgacaa 


aaaaaaagtt 


ccaggtgtgg 


aatatggggg 


1560 


ttatctgtac 


atcctggggc 


attaaaaaaa 


aaatcaatgg 


tggggaacta 


taaagaagta 


1620 


acaaaagaag 


tgacatcttc 


agcaaataaa 


ctaggaaatt 


tttttttctt 


ccagtttaga 


1680 


atcagccttg 


aaacattgat 


ggaataactc 


tgtggcatta ttgcattata 


taccatttat 


1740 


ctgtattaac 


tttggaatgt 


actctgttca 


atgtttaatg 


ctgtggttga 


tatttcgaaa 


1800 


gctgctttaa aaaaatacat gcatctcagc gtttttttgt ttttaattgt atttagttat 


1860 


ggcctataca 


ctatttgtga 


gcaaaggtga 


tcgttttctg 


tttgagattt 


ttatctcttg 


1920 


attcttcaaa 


agcattctga gaaggtgaga taagccctga ghctcagcta 


cctaagaaaa 


1980 


acctggatgt 


cactggccac 


tgaggagctt 


tgtttcaacc 


aagtcatgtg 


catttccacg 


2040 


tcaacagaat 


tgtttattgt 


gacagttata 


tctgttgtcc 


ctttgacctt 


gtttcttgaa 


2100 


ggtttcctcg 


tccctgggca 


attccgcatt 


haattcatgg 


tattcaggat 


tacatgcatg 


2160 


tttggttaaa 


cccatgagat 


tcattcagtfc 


aaaaatccag 


atggcaaatg 


accagcagat 


2220 


tcaaatctat 


ggtggtttga 


cctttagaga 


gttgctttac 


gtggcctgtt 


tcaacacaga 


2280 


cccacccaga gccctcctgc 


cctccttccg 


cgggggcttt 


ctcatggctg 


tccttcaggg 


2340 


tcttcctgaa 


atgcagtggt 


gcttacgctc 


caccaagaaa 


gcaggaaacc 


tgtggtatga 


2400 


agccagacct 


ccccggcggg 


cctcagggaa 


cagaatgatc 


agacctttga 


atgattctaa 


2460 


tttttaagca 


aaatattatt 


ttatgaaagg 


tttacattgt 


caaagtgatg 


aatatggaat 


2520 


atccaatcct 


gtgctgctat 


cctgccaaaa 


tcattttaat 


ggagtcagtt 


tgcagtatgc 


2580 


tccacgtggt 


aagatcctcc 


aagctgcttt 


agaagtaaca 


atgaagaacg 


tggacgcttt 


2640 


taatataaag 


cctgttttgt 


cttctgttgt 


tgttcaaacg ggattcacag 


agtatttgaa 


2700 


aaatgtatat 


atattaagag gtcacggggg 


ctaattgctg gctggctgcc 


ttttgctgtg 


2760 


gggttttgtt 


acctggtttt 


aataacagta 


aatgtgccca 


gcctcttggc 


cccagaactg 


2820 


tacagtattg 


tggctgcact 


tgctctaaga 


gtagttgatg 


ttgcattttc 


cttattgtta 


2880 


aaaacatgtt 


agaagcaatg 


aatgtahata 


aaagcctcaa 


ctagtcattt 


ttttctcctc 


2940 


ttcttttttt 


tcattatatc 


taattatttt 


gcagttgggc 


aacagagaac 


catccctatt 


3000 


ttgtattgaa 


gagggattca 


catctgcatc 


ttaactgctc 


tttatgaatg 


aaaaaacagt 


3060 


cctctgtatg 


tactcctctt 


tacactggcc 


agggtcagag 


ttaaatagag 


tatatgcact 


3120 


ttccaaattg gggacaaggg 


ctctaaaaaa 






atctqacraac 


3180 


ctcctcggcc 


ctcccagtcc 


ctcgctgcac 


aaatactccg 


caagagaggc 


cagaatgaca 


3240 


gctgacaggg 


tctatggcca 


tcgggtcgtc 


tccgaagatt 


tggcaggggc 


agaaaactct 


3300 


ggcaggctta 


agatttggaa taaagtcaca 


gaatcaagga 


agcacctcaa 


tttagttcaa 


3360 


acaagacgcc 


aacattctct ccacagctca cttacctctc tgtgttcaga tgtggccttc 


3420 


catttatatg tgatctttgt tttattagta aatgcttatc atctaaagat gtagctctgg 


3480 


cccagtggga aaaattagga agtgattata aatcgagagg agttataata atcaagatta 


3540 


aatgtaaata 


atcagggcaa 


tcccaacaca 


tgtctagctt tcacctccag gatctattga 


3600 


gtgaacagaa 


ttgcaaatag 


tctctatttg 


taattgaact 


tatcctaaaa 


caaatagttt 


3660 


ataaatgtga 


acttaaactc 


taattaattc 


caactgtact 


tttaaggcag 


tggctgtttt 


3720 


tagactttct 


tatcacttat 


agttagtaat 


gtacacctac 


tctatcagag 


aaaaacagga 


3780 


aaggctcgaa 


atacaagcca 


ttctaaggaa 


attagggagt 


cagttgaaat 


tctattctga 


3840 


tcttattctg 


tggtgtcttt 


tgcagcccag 


acaaatgtgg 


ttacacactt 


tttaagaaat 


3900 


acaattctac 


attgtcaagc 


ttatgaaggt 


tccaatcaga 


tctttattgt 


tattcaattt 


3960 


ggatctttca gggatttttt 


ttttaaatta 


ttatgggaca 


aaggacattt 


gttggagggg 


4020 


tgggagggag gaacaatttt 


taaatataaa 


acattcccaa 


gtttggatca 


gggagttgga 


4080 


agttttcaga 


ataaccagaa 


ctaagggtat 


gaaggacctg 


tattggggtc 


gatgtgatgc 


4140 


ctctgcgaag 


aaccttgtgt 


gacaaatgag 


aaacattttg 


aagtttgtgg 


tacgaccttt 


4200 


agattccaga 


gacatcagca tggctcaaag 


tgcagctccg 


tttggcagtg 


caatggtata 


4260 


aatttcaagc 


tggatatgtc 


taatgggtat 


ttaaacaata 


aatgtgcagfc 


tttaactaac 


4320 


aggatattta 


atgacaacct 


tctggttggt 


agggacatct 


gtttctaaat 


gtttattatg 


4380 


tacaatacag 


aaaaaaattt 


tataaaatta 


agcaatgtga 


aactgaattg 


gagagtgata 


4440 


atacaagtcc 


tttagtctta 


cccagtgaat 


cattctgttc 


catgtctttg 


gacaaccatg 


4500 


accttggaca 


atcatgaaat 


atgcatctca 


ctggatgcaa 


agaaaatcag 


atggagcatg 


4560 


aatggtactg 


taccggttca 


tctggactgc 


cccagaaaaa 


taacttcaag 


caaacatcct 


4620 


atcaacaaca 


aggttgttct 


gcataccaag 


ctgagcacag 


aagatgggaa 


cactggtgga 


4680 


ggatggaaag 


gctcgctcaa 


tcaagaaaat 


tctgagacta 


ttaafcaaata 


agactgtagt 


4740 


gtagatactg agtaaatcca tgcacctaaa 


ccttttggaa 


aatctgccgt 


gggccctcca 


4800 


gatagctcat 


ttcattaagt 


ttttccctcc 


aaggtagaat 


ttgcaagagt 


gacagtggat 


4860 


tgcatttctt 


ttggggaagc 


tttcttttgg tggttttgtt 


tattatacct 


tcttaagttt 


4920 


tcaaccaagg 


tttgcttttg 


ttttgagtta 


ctggggttat 


ttttgtttta 


aataaaaata 


4980 


agtgtacaat 


aagtgttttt 


gtattgaaag 


cttttgttat 


caagattttc 


atacttttac 


5040 


ctfcccatggc 


tctttttaag 


attgatactt 


ttaagaggtg gctgatattc 


tgcaacactg 


5100 


tacacataaa 


aaatacggta 


aggatacttt 


acatggttaa 


ggtaaagtaa 


gtctccagtt 


5160 


ggccaccatt 


agctataatg 


gcactttgtt 


tgtgttgttg 


gaaaaagtca 


cattgccatt 


5220 


aaactttcct 


tgtctgtcta gttaatattg tgaagaaaaa 


taaagtacag 


tgtgagatac 


5280 


tg 












5282 



<210> 57 

<211> 117 

<212> DNA 

<213> Homo sapiens 

<400> 57 

attcggggcg agggaggagg aagaagcgga ggaggcggct cccgctcgca gggccgtgca 60 
cctgcccgcc cgcccgctcg ctcgctcgcc cgccgcgccg cgctgccgac cgccagc 117 



<210> 58 

<211> 430 

<212> DNA 

<213> Homo sapiens 

<400> 58 

tgatccaggg agcccccacc atccgggggg accccgagtg tcatctcttc tacaatgagc 60 

agcaggaggc ttgcggggtg cacacccagc ggatgcagta gaccgcagcc agccggtgcc 120 

tggcgcccct gccccccgcc cctctccaaa caccggcaga aaacggagag tgcttgggtg 180 

gtgggtgctg gaggattttc cagttctgac acacgtattt atatttggaa agagaccagc 240 

accgagctcg gcacctcccc ggcctctctc ttcccagctg cagatgccac acctgctcct 300 

tcttgctttc cccgggggag gaagggggtt gtggtcgggg agctggggta caggtttggg 360 

gagggggaag agaaattttt atttttgaac ccctgtgtcc cttttgcata agattaaagg 420 
aaggaaaagt 



430 



<210> 59 

<211> 192 

<212> DNA 

<213> Homo sapiens 

<400> 59 

tcctaggcgg cggccgcggc ggcggaggca gcagcggcgg cggcagtggc ggcggcgaag 60 

gtggcggcgg ctcggccagt actcccggcc cccgccattt cggactggga gcgagcgcgg 120 

cgcaggcact gaaggcggcg gcggggccag aggctcagcg gctcccaggt gcgggagaga 180 

ggcctgctga aa 192 



<210> 60 

<211> 4172 

<212> DNA 

<213> Homo sapiens 

<400> 60 



taaatacaat 


ttgtactttt 


ttcttaaggc 


atactagtac 


aagtggtaat 


ttttgtacat 


60 


tacactaaat 


tattagcatt 


tgttttagca 


ttacctaatt 


tttttcctgc 


tccatgcaga 


120 


ctgttagctt 


ttaccttaaa 


tgcttatttt 


aaaatgacag 


tggaagtttt 


tttttcctcg 


180 


aagtgccagt 


attcccagag 


U U Ti. 3 3 U U 


ua^^^ ^^a 


a "h aT" f I" cr t* cr a 


aaaagaaact 


240 


gaatacctaa gatttctgtc 


I- 1- a a y a t- 1- 1- 1 


L.aa i-a^^^'^a " 


a crl" "i" era "h t" a r* 


ttcttatttt 


300 


tcttaccaag 


tgtgaatgtt 


aa ^3 i-a^^^^ 


a a a 1~ t" a a Irrs 


a.gct tt tgaa 


tcatccctat 


360 


tctgtgtttt 


atctagtcac 


a 4- a a sa ^ n/^s 4* 
a. Ua-ad L.aa ^ ^ 




i" i~ i" r" a.a" t fc era 


gaccttctaa 


420 


ttggttttta 


ctgaaacatt 


a a a a ^*-'CL^_>a. 


a.a. 1 1 1 a. 1 9"99 


cttcctgatg 


atgattcttc 


480 


taggcatcat 


gtcctatagt 


ttgtcatccc 


tgatgaatgt 


aaaglztacac 


tgttcacaaa 


540 


ggttttgtct 


cctttccact 


gctattagtc 


atggtcactc 


tccccaaaat 


attatatttt 


600 


ttctataaaa 


agaaaaaaat 


ggaaaaaaat 


tacaaggcaa 


tgg3^a^£Lctat 


tataaggcca 


660 


tttccttttc 


acattagata 


aattactata 


aagactccta 


atagcttttt 


cctgttaagg 


720 


cagacccagt 


atgaatggga 


ttattatagc 


aaccattttg 


gggctatatt 


tacatgctac 


780 


taaattttta 


taataattga 


aaagatttta 


acaagtataa 


aaaaattctc 


ataggaatta 


840 


aatgtagtct 


ccctgtgtca 


gactgctctt 


tcatagtata 


actttaaatc 


ttttcttcaa 


900 


cttgagtctt 


tgaagatagt 


tttaattctg cttgtgacat 


taaaagatta 


tttgggccag 


960 


ttatagctta 


ttaggtgttg 


aagagaccaa ggttgcaagc 


caggccctgt 


gtgaaccttg 


1020 


agctttcata gagagtttca 


cagcatggac 


tgtgtgcccc 


acggtcatcc 


gagtggttgt 


1080 


acgatgcatt 


ggttagtcaa 


aaatggggag ggactagggc 


agtttggata gctcaacaag 


1140 


atacaatctc 


actctgtggt 


ggtcctgctg acaaatcaag 


agcattgctt 


ttgtttctta 


1200 


agaaaacaaa 


ctctttttta 


aaaattactt 


ttaaatatta 


actcaaaagt 


tgagattttg 


1260 


gggtggtggt 


gtgccaagac 


attaattttt 


tttttaaaca 


atgaagtgaa 


aaagttttac 


1320 


aatctctagg 


tttggctagt 


tctcttaaca 


ctggttaaat 


taacattgca 


taaacacttt 


1380 


tcaagtctga 


tccatattta 


ataatgcttt 


aaaataaaaa 


taaaaacaat 


ccttttgata 


1440 


aatttaaaat 


gttacttatt 


ttaaaataaa 


tgaagtgaga 


tggcatggtg 


aggtgaaagt 


1500 


atcactggac 


taggttgttg gtgacttagg ttctagatag 


gtgtctttta ggactctgat 


1560 


tttgaggaca 


tcacttacta tccatttctt catgttaaaa gaagtcatct 


caaactctta 


1620 


gttttttttt 


tttacactat 


gtgatttata ttccatttac 


ataaggatac 


acttatttgt 


1680 


caagctcagc 


acaatctgta 


aatttttaac 


ctatgttaca 


ccatcttcag 


tgccagtctt 


1740 


gggcaaaatt gtgcaagagg tgaagtttat atttgaatat 


ccattctcgt 


tttaggactc 


1800 


ttcttccata ttagtgtcat 


cttgcctccc 


taccttccac 


atgccccatg 


acttgatgca 


1860 


gttttaatac 


ttgtaattcc 


cctaaccata 


agatttactg 


ctgctgtgga 


tatctccatg 


1920 


aagttttccc 


actgagtcac 


atcagaaatg 


ccctacatct 


tattttcctc 


agggctcaag 


1980 


agaatctgac 


agataccata 


aagggatttg acctaatcac 


taattttcag 


gtggtggctg 


2040 


atgctttgaa 


catctctttg 


ctgcccaatc 


cattagcgac 


agtaggattt 


ttcaaccctg 


2100 


gtatgaatag 


acagaaccct 


atccagtgga 


aggagaattt 


aataaagata gtgcagaaag 


2160 


aattccttag 


gtaatctata 


actaggacta 


ctcctggtaa 


cagtaataca 


ttccattgtt 


2220 


ttagtaacca 


gaaatcttca 


tgcaatgaaa 


aatactttaa 


ttcatgaagc 


ttactttttt 


2280 


ttttttggtg 


tcagagtctc 


gctcttgtca 


cccaggctgg 


aatgcagtgg 


cgccatctca 


2340 


gctcactgca 


accttccatc 


ttcccaggtt 


caagcgattc 


tcgtgcctcg gcctcctgag 


2400 


tagctgggat 


tacaggcgtg 


tgcactacac 


tcaactaatt 


tttgtatttt 


taggagagac 


2460 


ggggtttcac 


ctgttggcca ggctggtctc gaactcctga 


cctcaagtga 


ttcacccacc 


2520 


ttggcctcat 


aaacctgttt 


tgcagaactc 


atttattcag 


caaatattta 


ttgagtgcct 


2580 


accagatgcc agtcaccgca caaggcactg ggtatatggt 


atccccaaac 


aagagacata 


2640 


atcccggtcc 


ttaggtactg 


ctagtgtggt 


ctgtaatatc 


ttactaaggc 


ctttggtata 


2700 


cgacccagag 


ataacacgat gcgtatttta gttttgcaaa gaaggggttt ggtctctgtg 


2760 


ccagctctat 


aattgttttg 


ctacgattcc 


actgaaactc 


ttcgatcaag 


ctactttatg 


2820 


taaatcactt 


cattgtttta 


aaggaataaa 


cttgattata 


ttgttttttt 


atttggcata 


2880 


actgtgattc 


ttttaggaca 


attactgtac 


acattaaggt 


gtatgtcaga 


tattcatatt 


2940 


gacccaaatg 


tgtaatattc 


cagttttctc 


tgcataagta 


attaaaatat 


acttaaaaat 


3000 


taatagtttt 


atctgggtac 


aaataaacag 


tgcctgaact 


agttcacaga 


caagggaaac 


3060 


ttctatgtaa 


aaatcactat 


gatttctgaa 


ttgctatgtg 


aaactacaga 


tctttggaac 


3120 


actgtttagg 


tagggtgtta 


agacttgaca 


cagtacctcg 


tttctacaca 


gagaaagaaa 


3180 


tggccatact 


tcaggaactg 


cagtgcttat 


gaggggatat 


ttaggcctct 


tgaatttttg 


3240 


atgtagatgg 


gcattttttt 


aaggtagtgg 


ttaattacct 


ttatgtgaac 


tttgaatggt 


3300 


ttaacaaaag 


atttgttttt 


gtagagattt 


taaaggggga 


gaattctaga 


aataaatgtt 


3360 


acctaattat 


tacagcctta 


aagacaaaaa 


tccttgttga 


agttttttta 


aaaaaagact 


3420 

aaattacata gacttaggca ttaacatgtt tgtggaagaa tatagcagac gtatattgta 3480 

tcatttgagt gaatgttccc aagtaggcat tctaggctct atttaactga gtcacactgc 3540 

ataggaattb agaacctaac ttttataggt tatcaaaact gttgtcacca ttgcacaatt 3600 

ttgtcctaat atatacatag aaactttgtg gggcatgtta agttacagtt tgcacaagtt 3660 

catctcattt gtattccatt gatttttttt tttcttctaa acattttttc ttcaaaacag 3720 

tatatataac tttttttagg ggattttttt tagacagcaa aaaactatct gaagatttcc 3780 

atttgtcaaa aagtaatgat ttcttgataa ttgtgtagtg aatgtttttt agaacccagc 3840 

agttaccttg aaagctgaat ttatatttag taacttctgt gttaatactg gatagcatga 3900 

attctgcatt gagaaactga atagctgtca taaaatgctt tctttcctaa agaaagatac 3960 

tcacatgagt tcttgaagaa tagtcataac tagattaaga tctgtgtttt agtttaatag 4020 

tttgaagtgc ctgtttggga taatgatagg taatttagat gaatttaggg gaaaaaaaag 4080 

ttatctgcag ttatgttgag ggcccatctc tccccccaca cccccacaga gctaactggg 4140 

ttacagtgtt ttatccgaaa gtttccaatt cc 4172 

<210> 61 

<211> 238 

<212> DNA 

<213> Homo sapiens 

<400> 61 

ccattgtgct ggaaaggcgc gcaacggcgg cgacggcggc gaccccaccg cgcatcctgc 60 

caggcctccg cgcccagccg cccacgcgcc cccgcgcccc gcgccccgac cctttcttcg 120 

cgcccccgcc cctcggcccg ccaggccccc ttgccggcca cccgccaggc cccgcgccgg 180 

cccgcccgcc gcccaggacc ggcccgcgcc ccgcaggccg cccgccgccc gcgccgcc 238 

<210> 62 

<211> 547 

<212> DNA 

<213> Homo sapiens 

<400> 62 

ggccccgcag ctctggccac agggacctct gcagtgcccc ctaagtgacc cggacacttc 60 

cgagggggcc atcaccgcct gtgtatataa cgtttccggt attactctgc tacacgtagc 120 

ctttttactt ttggggtttt gtttttgttc tgaactttcc tgttaccttt tcagggctga 180 

tgtcacatgt aggtggcgtg tatgagtgga gacgggcctg ggtcttgggg actggagggc 240 

aggggtcctt ctgcccctgg ggtcccaggg tgctctgcct gctcagccag gcctctcctg 300 

ggagccactc gcccagagac tcagcttggc caacttgggg ggctgtgtcc acccagcccg 360 

cccgtcctgt gggctgcaca gctcaccttg ttccctcctg ccccggttcg agagccgagt 420 

ctgtgggcac tctctgcctt catgcacctg tcctttctaa cacgtcgcct tcaactgtaa 480 

tcacaacatc ctgactccgt catttaataa agaaggaaca tcaggcatgc taaaaaaaaa 540 

aaaaaaa 547 

<210> 63 

<211> 102 

<212> DNA 

<213> Homo sapiens 

<400> 63 



gaattccggc 


aaacatgagg 


cagctgccag 


ccggcctggg cagtcttgtc 


tgcctcggcf 


60 


gtgaagtggg gaggctggca 


acagttttct 


tcagcgccca gg 




102 


<210> 64 

<211> 2017 

<212> DNA 

<213> Homo sapiens 










<400> 64 
gacacgtcca 


aaggagtgca 


tggccacagc 


cacctccacc cccaagaaac 


ctccatcctg 


60 


ccaggagcag 


cctccaagaa 


acttttaaaa 


aatagatttg caaaaagtga 


acagattgct 


120 


acacacacac 


acacacacac 


acacacacac 


acacacagcc attcatctgg 


gctggcagag 


180 


gggacagagt 


tcagggaggg 


gctgagtctg 


gctaggggcc gagtccagag 


gccccagcca 


240 


gcccttccca 


ggccagcgag gcgaggctgc 


ctctgggtga gtggctgaca gagcaggtct 


300 


gcaggccacc 


agctgctgga 


tgtcaccaag 


aaggggctcg agtgccctgc 


aggagggtcc 


360 


aatcctccgg 


tcccacctcg tcccgttcat 


ccattctgct ttcttgccac 


acagtggccg 


420 


gcccaggctc 


ccctggtctc 


ctccccgtag 


ccactctctg cccactacct 


atgcttctag 


480 


aaagcccctc 


acctcaggac 


cccagaggac 


cagctggggg gcagggggga 


gagggggtaa 


540 


tggaggccaa 


gcctgcagct 


ttctggaaat 


tcttccctgg gggtcccagt 


atcccctgct 


600 


actccactga 


cctggaagag 


ctgggtacca 


ggccacccac tgtggggcaa 


gcctgagtgg 


660 


tgaggggcca 


ctggcatcat tctccctcca tggcaggaag gcgggggatt tcaagtttag 


720 


ggattgggtc 


gtggtggaga 


atctgagggc 


actctgccag ctccacaggt ggatgagcct 


780 


ctccttgccc 


cagtcctggt 


tcagtgggaa 


tgcagtgggt ggggctgtac 


acaccctcca 


840 


gcacagactg 


ttccctccaa 


ggtcctctta 


ggtcccgggg aggaacgtgg 


ttcagagact 


900 


ggcagccagg 


gagcccgggg 


cagagctcag 


aggagtctgg gaaggggcgt 


gtccctcctc 


960 


ttcctgtagt 


gcccctccca 


tggcccagca 


gcttggctga gcccctctcc 


tgaagcagct 


1020 


gtgcgccgtc 


L. V-f Uy o t-. L. L- 


gcacaaaaag 


cacaagacat 


t- r* r* i" +" acre acf 


ctcagcgcag 


1080 


ccctagtggg 




actgcttctc 


ggaggccagg 


ccctcctgct 


ggctgagctt 


1140 


gggcccggtg 


gccccaatat 


ggtggccctg gggaagaggc 


cttgggggtc 


tgctctgtgc 


1200 


ctgggatcag 


tggggcccca 


aagcccagcc 


cggctgacca 


acattcaaaa 


gcacaaaccc 


1260 


tggggactct 


gcttggctgt 


cccctccatc 


tggggatgga 


gaatgcagcc 


caaagctgga 


1320 


gccaatggtg 


agggctgaga gggctgtggc 


tgggtggtca gcagaaaccc 


c acrcracrcracfa 


1380 


gagatgctgc 


tcccgcctga 


ttggggcctc 


acccagaagg 


aacccggtcc 


cagccgcatg 


1440 


gcccctccag 


gaacattccc 


acataataca 


ttccatcaca 


gccagcccag 


ctccactcag 


1500 


ggctggcccg 


gggagtcccc 


gtgtgcccca 


agaggctagc 


cccagggtga 


gcagggccct 


1560 


Gaga.gga.a.ag 


gcagtatggc 


ggaggccatg ggggcccctc 


ggcattcaca 


cacagcctgg 


1620 


cctcccctgc 


ggagctgcat 


ggacgcctgg 


ctccaggctc 


caggctgact 


ggggcctctg 


1680 


cctccaggag 


ggcatcagct 


ttccctggct 


cagggatctt 


ctccctcccc 


tcacccgctg 


1740 




ccagctgatg 


tcactctgcc 


tctaagccaa 


ggcctcagga gagcatcacc 


1800 


accacaccct 


gcggccttgc 


cttggggcca gactggctgc 


acagcccaac 


caggaggggt 


1860 


ctgcctccca 


cgctgggaca 


cagaccggcc 


gcatgtctgc 


atggcagaag 


cgtctccctt 


1920 


gccacggcct 


gggagggtgg 


ttcctgttct 


cagcatccac 


taatattcag 


tcctgtatat 


1980 


tttaataaaa 


taaacttgac 


aaaggaaaaa 


aaaaccg 






2017 



<210> 65 

<211> 97 

<212> DNA 

<213> Homo sapiens 

<400> 65 

gtccaggaac tcctcagcag cgcctccttc agctccacag ccagacgccc tcagacagca 60 
aagcctaccc ccgcgccgcg ccctgcccgc cgctgcg 97 

<210> 66 

<211> 1474 

<212> DNA 

<213> Homo sapiens 

<400> 66 

aagtctaatg atcatattta tttatttata tgaaccatgt ctattaattt aattatttaa 60 

taatatttat attaaactcc ttatgttact taacatcttc tgtaacagaa gtcagtactc 120 

ctgttgcgga gaaaggagtc atacttgtga agacttttat gtcactactc taaagatttt 180 

gctgttgctg ttaagtttgg aaaacagttt ttattctgtt ttataaacca gagagaaatg 240 

agttttgacg tctttttact tgaatttcaa cttatattat aaggacgaaa gtaaagatgt 300 

ttgaatactt aaacactatc acaagatgcc aaaatgctga aagtttttac actgtcgatg 360 

tttccaatgc atcttccatg atgcattaga agtaactaat gtttgaaatt ttaaagtact 420 

tttgggtatt tttctgtcat caaacaaaac aggtatcagt gcattattaa atgaatattt 480 

aaattagaca ttaccagtaa tttcatgtct actttttaaa atcagcaatg aaacaataat 540 



ttgaaatttc taaattcata gggtagaatc acctgtaaaa gcttgtttga tttcttaaag 600 

ttattaaact tgtacatata ccaaaaagaa gctgtcttgg atttaaatct gtaaaatcag 660 

atgaaatttt actacaattg cttgttaaaa tattttataa gtgatgttcc tttttcacca 720 

agagtataaa cctttttagt gtgactgtta aaacttcctt ttaaatcaaa atgccaaatt 780 

tattaaggtg gtggagccac tgcagtgtta tctcaaaata agaatatcct gttgagatat 840 

tccagaatct gtttatatgg ctggtaacat gtaaaaaccc cataaccccg ccaaaagggg 900 

tcctaccctt gaacataaag caataaccaa aggagaaaag cccaaattat tggttccaaa 960 

tttagggttt aaachttttg aagcaaactt ttttttagcc ttgtgcactg cagacctggt 1020 

actcagattt tgctatgagg ttaatgaagt accaagctgt gcttgaataa cgatatgttt 1080 

tctcagattt tctgttgtac agtttaattt agcagtccat atcacattgc aaaagtagca 1140 

atgacctcat aaaatacctc ttcaaaatgc ttaaattcat ttcacacatt aattttatct 1200 

cagtcttgaa gccaattcag taggtgcatt ggaatcaagc ctggctacct gcatgctgbt 1260 

ccttttcttt tcttctttta gccattttgc taagagacac agtcttctca aacacttcgt 1320 

ttctcctatt ttgttttact agttttaaga tcagagttca ctttctttgg actctgccta 1380 

tattttctta cctgaactfct tgcaagtttt caggtaaacc tcagctcagg actgctattt 1440 

agctcctctt aagaagatta aaaaaaaaaa aaaa 1474 



<210> 67 

<211> 99 

<212> DNA 

<213> Homo sapiens 

<400> 67 

gcgcccggcc cccacccctc gcagcacccc gcgccccgcg ccctcccagc cgggtccagc 60 
cggagccatg gggccggagc cgcagtgagc accatggag 99 



<210> 68 

<211> 614 

<212> DNA 

<213> Homo sapiens 

<400> 68 

tgaaccagaa ggccaagtcc gcagaagccc tgatgtgtcc tcagggagca gggaaggcct 60 

gacttctgct ggcatcaaga ggtgggaggg ccctccgacc acttccaggg gaacctgcca 120 

tgccaggaac ctgtcctaag gaaccttcct tcctgcttga gttcccagat ggctggaagg 180 

ggtccagcct cgttggaaga ggaacagcac tggggagtct ttgtggattc tgaggccctg 240 

cccaatgaga ctctagggtc cagtggatgc cacagcccag cttggccctt tccttccaga 300 

tcctgggtac tgaaagcctt agggaagctg gcctgagagg ggaagcggcc ctaagggagt 360 

gtctaagaac aaaagcgacc cattcagaga ctgtccctga aacctagtac tgccccccat 420 

gaggaaggaa cagcaatggt gtcagtatcc aggctttgta cagagtgctt ttctgtttag 480 

tttttacttt ttttgttttg tttttttaaa gacgaaataa agacccaggg gagaatgggt 540 

gttgtatggg gaggcaagtg tggggggtcc ttctccacac ccactttgtc catttgcaaa 600 

tatattttgg aaaa 614 



<210> 69 

<211> 36 

<212> DNA 

<213> Artificial 

<220> 

<223> Description of Artificial Sequence: Primer 

<400> 69 

aaagtcgacg taatcgcgga ggcttggggc agccgg 36 



<210> 70 

<211> 30 

<212> DNA 

<213> Artificial 

<220> 

<223> Description of Artificial Sequence: Primer 

<400> 70 

tttgcgactg gtcagctgcg ggatcccaag 30 



<210> 71 

<211> 33 

<212> DNA 

<213> Artificial 

<220> 

<223> Description of Artificial Sequence: Primer 

<400> 71 

aagtcgacgt aagagctcca gagagaagtc gag 33 



<210> 72 

<211> 33 

<212> DNA 

<213> Artificial 

<220> 

<223> Description of Artificial Sequence: Primer 

<400> 72 

aaacccgggc agcaaggcaa ggctccaatg cac 33 



<210> 73 

<211> 39 

<212> DNA 

<213> Artificial 



<220> 

<223> Description of Artificial Sequence: Primer 
<400> 73 

gccgggcagg aggaaggagc ctccctcagg gtttcggga 39 



<210> 74 

<211> 30 

<212> DNA 

<213> Artificial 

<220> 

<223> Description of Artificial Sequence: Primer 

<400> 74 

ctgcactaga gacaaagacg tgatgttaat 30 



<210> 75 

<211> 66 

<212> DNA 

<213> Artificial 

<220> 

<223> Description of Artificial Sequence: Polylinker 
<400> 75 

gaacaaatgt cgacgggggc ccctagcaga tctagcgctg gatcccccgg ggagctcaug 60 
gaagac 66 



<210> 76 

<211> 30 

<212> DNA 

<213> Artificial 

<220> 

<223> Description of Artificial Sequence: Primer 

<400> 76 

cggtgttggg cgcgttattt atcggagttg 30 



<210> 77 
<211> 30 
<212> DNA 
<213> Artificial 
<220> 

<223> Description of Artificial Sequence: Primer 

<400> 77 

ttggcgaaga atgaaaatag ggttggtact 30 



<210> 78 

<211> 22 

<212> DNA 

<213> Artificial 

<220> 

<223> Description of Artificial Sequence: Primer 

<400> 78 

ggtgaaggtc ggagtcaacg ga 22 



<210> 79 

<211> 21 

<212> DNA 

<213> Artificial 

<220> 

<223> Description of Artificial Sequence: Primer 

<400> 79 

gagggatctc gctcctggaa g 21 



<210> 80 

<211> 55 

<212> DNA 

<213> Artificial 



<220> 

<223> Description of Artificial Sequence: Primer 

<400> 80 

aaagtcgacg taaccgccag atttgaatcg cgggacccgt tggcagaggt ggcgg 55 



<210> 81 

<211> 54 

<212> DNA 

<213> Artificial 

<220> 

<223> Description of Artificial Sequence: Primer 

<400> 81 

aaaggatccg ggcaacgtcg gggcacccat gccgccgccg ccacctctgc caac 54 



<210> 82 

<211> 40 

<212> DNA 

<213> Artificial 



<210> 87 

<211> 31 

<212> DNA 

<213> Artificial 

<220> 

<223> Description of Artificial Sequence: Primer 

<400> 87 

agcccatggt gctcactgcg gctccggccc c 31 



<210> 88 

<211> 22 

<212> DNA 

<213> Artificial 

<220> 

<223> Description of Artificial Sequence: Primer 

<400> 88 

agactctgaa ccagaaggcc aa 22 



<210> 89 

<211> 36 

<212> DNA 

<213> Artificial 



<220> 

<223> Description of Artificial Sequence: Primer 
<400> 89 

ctcggtacca gttttccaaa atatatttgc aaatgg 36 



<210> 90 

<211> 58 

<212> DNA 

<213> Artificial 



<220> 

<223> Description of Artificial Sequence: Primer 

<400> 90 

cccaagcttc gcgcccggcc ccccacccct cgcagcaccc cgcgccccgc gccctccc 58 



<210> 91 

<211> 61 

<212> DNA 

<213> Artificial 

<220> 

<223> Description of Artificial Sequence: Primer 

<400> 91 

ggccccatgg ctccggctgg acccggctgg gacccggctg ggagggcgcg ggagggcgcg 60 
g 61 



<210> 92 

<211> 7008 

<212> DNA 

<213> Artificial 

<220> 

<223> Description of Artificial Sequence: Expression Vector 



<400> 92 
y cLt^ y y d. u y y 


yctyciu^ L. <w> 


(—fa \~eT*c^r*\~^^ 
y CL L. L^cLL- 


ggtgca.ct c t 


cagtacaatc 


tgctctgatg 


60 


or* a a PT "H "h 


d-cty ^ cty LiCt ^ 


t" cir^ i" r* r* i~ rr 
wuy^ L>L<^^uy 


^ u uy uy uy u l> 


gyaggccgct 


gagtagtgcg 


120 


fr-ra f~Tr^ a a = a i- 
L^y ciy v.. ddcict l. 


Lr L>CtCiy \^ UCL^CL 


a <^a a cxr'xc* a a 
a^aay y L«aay 


<-fi-i4" t"<-ra i~*ft'^ 

yci^cyaccga. 


Caa.Uli.gCaUg 


aagaatctgc 


180 




y*— y I- 1- 1- *-y '-y 


\^ L>y^ Li Li^y^y 


cLL>y LiA^yyy ^ 


^ a f^a ^ a ^ a 
^ay ci Lia Lidcy 


L>y l« cy aCatC 


240 


rya ^ ^ a ^ ^ f*f a r* 


^ a a i" a a 

Li ay Li L«Cl Li LiCLCL 


4— a fri" a a 'f* 1^ a a 
uay uaa uLfda 


^ ^ a ffn-tftt i 1 *^ 

u uaL>y yy y LiL> 


a arrt--h ria 4- 
a Li L. ay Li L. LfO. C 


agcccatata 


300 




pert" 1~a pa t"a a 


pf" ^ a r*rte^f' s a 
^ L^ai^y y L-aa 


aL^yy^L^L^y^^ 


uy y Lf uy acsi^y 


^ ^ ^ cLa C y d C L> 


360 


rTT'crr'rT'a t" t" 


rra prrt* pa a "h a 


ai-ya^y LiaLiy 


L. L. ^ L> v.> a L. ay L. 


a a /"^fff* ^a a 1~ a 

aa^y ccadLia 


gggaccttcc 






o-i-yyy L.yya.y 


^ a +" t" +" a r'orr'H 
L.aL.l.< L-ciL>yy L. 


aa a r^^r^r^r^r*^ 
clctcLL^ uy (.^L-L-cL 


L. L. L.y y L.eLy Li a 


Cdt.CddgL.gU 


480 


a 1" r* a 1" a "h csr* c 

a. xrfOi L«a Vif Vi« 


a afTl~a r*cTr>t^r* 
cLcLy LiCtoy ^ 


^Lrf L>a L> L>y av.>y 


LiU>aa uy av.>y y 


Lidaa Liyy ^(_>L> 


y ccuggcdU u 




a ■hcrrTT'a crt" a 

ClUp^^^wCL^ La CL 


pal" na (""pi" "h a 

^ClL>yCl^O Li LiCL 


i.yyya^ l^v^ 


uiLia^L. L.yy^a 


/Ti~ a/^a4~/^4~a/^ 
y l_CH_cl L.O UCLL. 


y L.dL. L.dy Cirfd 


600 


c t "bs. c 


G a t cf cr i* CT a 1" CI 


pcrcrt" "h 1~ 1" ctctc 
^yy ^ ^^^yyv 


a n"t~ a p a pa a 
ciy L>ctL>aL«^aa 


i-yggcgtgga 


ucty uy y l» u Ly 




3. c t c a. c 9" go'^ 


ai"'t"t"pP3aQ"t" 


*v k^v^^ao^^v>a 


t" t~rtaf^ol~(*^aa 
L-uya^yuL^aa 


1" oooa (~t4~ "t" 4~ o 

Lyyyagi-i-uy 


4~ 4~ 4~ 4" f~fri ft ^ c f> 
l_L_ UL^yyudUL. 


720 




y y L. L- i_ CL 


a aat-rT't~r^rTt" a 
cictcLL.y u^^y Uct 


a 4~ /~> /"I r~r 

ducicio ucoy o 


CCCaL.L.gaCg 


caaa'bgggcg 


780 


y ^y ^ 


dt-yy Lgy y Siy 


y (..v^ ua L.a uaa 


i~f /"< a »~f a /fi^ 4~ 4" 

y caga.gcii.cu 


ctggctaact 


aagcttiti egg 


840 


r* CT r* Cf r* p cr a CT cr 
y y^^y ciy y 


i" a p p a t" fTiTrf a 
ucL^^cLi.yyya, 


4— /-I r^a arra r*c^ 
Li^^y aay cLv^y 


CLfCLaaaaoa L. 


aaagaaaggc 


ccggcgccat 


900 


fcp'ha'lT'pt'pt* 

W VhiI l« law L> 


a da o<~ra "H rrrra 
ay cLy y CL u y y CL 


a(«iL*y ^ ^y y ^y 


ay L> aa^ uy c a 


L.aaggcuaug 


aagagatacg 


960 


ccctggttcc 


t" crcra a pa a 1" 1" 

Lm^^ CLCLV^Cia L- L- 


y L> L> L- u> L-aL<ay 


a trrpa pa t" a i" 
a L>y LaCiL>et L>ci l- 


^^r^a no 4" oa a ^ 
L.yayy uyaaL. 


dUv^id^y UdCy 




cggaatacfct 


ccra a a i" cri" p p 

w ^ Gi> aa Li ^ La ^ \^ 


y L. L. \.i«y y Li L.yy 


pana a rr^^t* a ^ 
u>ayaay^ L>aL> 


fra a a /"ir^ra 4~ a 4" 

y aaa^y a l. a u 


ggy cugadLia 


1080 


caaatcacag 


aatcgtcgta 


tgcagtgaaa 


actctcttca 


attctttatg 


ccggtgttgg 


1140 


gcgcgttatt 


tatcggagtt 


gcagttgcgc 


ccgcgaacga 


catttataat 


gaacgtgaat 


1200 


tgctcaacag 


tatgaacatt 


tcgcagccta 


ccgtagtgtt 


tgtttccaaa 


aaggggttgc 


1260 


aaaaaatttt 


gaacgtgcaa 


aaaaaattac 


caataatcca 


gaaaattatt 


atcatggatt 


1320 


ctaaaacgga 


ttaccaggga 


tttcagtcga 


tgtacacgtt 


cgtcacatct 


catctacctc 


1380 


ccggttttaa 


tgaatacgat 


tttgtaccag 


agtcctttga 


tcgtgacaaa 


acaattgcac 


1440 


tgataatgaa 


ttcctctgga 


tctactgggt 


tacctaaggg 


tgtggccctt 


ccgcatagaa 


1500 


ctgcctgcgt 


cagattctcg 


catgccagag 


atcctatttt 


tggcaatcaa 


atcattccgg 


1560 


atactgcgat 


tttaagtgtt 


gttccattcc 


atcacggttt 


tggaatgttt 


actacactcg 


1620 


gatatttgat 


atgtggattt 


cgagtcgtct 


taatgtatag 


atttgaagaa 


gagctgtttt 


1680 


tacgatccct 


tcaggattac 


aaaattcaaa 


gtgcgttgct 


agtaccaacc 


ccacccccac 


1740 


tcttcgccaa 


aagcactctg 


attgacaaat 


acgatttatc 


taatttacac 


gaaattgctt 


1800 


ctgggggcgc 


acctctttcg 


aaagaagtcg 


gggaagcggt 


tgcaaaacgc 


ttccatcbtc 


1860 


cagggatacg 


acaaggatat 


gggctcactg 


agactacatc 


agctattctg 


atztacacccg 


1920 


agggggatga 


taaaccgggc 


gcggtcggta 


aagttgttcc 


ac uccccgaa 


gcgaaggttg 


1980 


tggatctgga 


taccgggaaa 


acgctgggcg 


ttaatcagag 


aggcgaatta 


tg'bg'tcagag 


2040 


gacctatgat 


tatgtccggt 


tatgtaaaca 


atccggaagc 


gaccaacgcc 


ttgattgaca 


2100 


aggatggatg 


gctacattct 


ggagacatag 


cttactggga 


cgaagacgaa 


cs-cttcttca. 


2160 


tagttgaccg 


cttgaagtct 


ttaattaaat 


acaaaggata 


tcaggtggcc 


cccgctgaat 


2220 


tggaatcgah 


attgttacaa 


caccccaaca 


tcttcgacgc 


gggcgtggca 


ggtctfccccg 


2280 


acgatgacgc 


cggtgaactt 


cccgccgccg 


ttgctgtttc 


ggagcacgga 


aagacgatga 


2340 


cggaaaaaga 


gatcgtggat 


tacgtcgcca 


gtcaagtaac 


aaccgcgaaa 


aagttgcgcg 


2400 


gaggagttgt 


gtttgtggac 


gaagtaccga 


aaggtcttac 


cggaaaactc 


gacgcaagaa 


2460 


aaatcagaga 


gatccbcata 


aaggccaaga 




gtccaaattg 


cgcggccgct 


2520 


aactcgagaa 


taaaatgagg 


aaattgcatc 


gcattgtctg 


agtaggtgtc 


auuc uauccu 


2580 


ggggggtggg 


gtggggcagg 


acagcaaggg 


ggaggattgg 


gaagacaata 


gcaggcatgc 


2640 


tggggatgcg 


gtgggctcta 


tggcttctga 


ggcggaaaga 


accagctggg 


gctctagggg 


2700 


gtatccccac 


gcgccctgta 


gcggcgcatt 


aagcgcggcg 




tfcacgcgcag 


2760 


cgtgaccgct 


acacttgcca 


gcgcGctagc 


gcccgctcct 




tccc ttcctfc 


2820 


tctcgccacg 


ttcgccggct 


ttccccgtca 


agctctaaat 


cggg^ggc t c c 


ctttagggtt 


2880 


ccgatttagt 


gctttacggc 


acctcgaccc 


caaaaaactt 


gattagggtg 


atggttcacg 


2940 


tagtgggcca 


tcgccctgat 


agacggtttt 


tcgccctttg 


a. c ^ 1 1 9^9^ a. 9^ t 


ccacgt'tc'tti 


3000 


taatagtgga 


ctcttgttcc 


aaactggaac 


aacactcaac 


cctatctcgg 


uctattCttc 




tgatttataa 


gggattttgc 


cgatttcggc 


ctattggtta 


a.E cia. a. 1 9 ^9 ^ 


tgatttaaca 


3120 


aaaatttaac 


gcgaattaat 


tctgtggaat 


gtgtgtcagt 


tagggtgtgg 


aaagtcccca 


3180 


ggctccccag 


caggcagaag 


tatgcaaagc 


atgcatctca 


attagtcagc 


aaccaggtgt 


3240 


ggaaagtccc 


caggctcccc 


agcaggcaga 


agtatgcaaa 


gcatgcatct 


caattagtca 


3300 


gcaaccatag 


tcccgcccct 


aactccgccc 


atcccgcccc 


taactccgcc 


cagttccgcc 


3360 


cattctccgc 


cccatggctg 


actaattttt 


tttatttatg 


cagaggccga 


ggccgcctct 


3420 






gcctctgagc 


tattccagaa 


gtagtgagga 


ggcttttttg 


gaggcctagg 


cttttgcaaa 


3480 


aagctcccgg 


gagcttgtat 


atccattttc 


ggatctgatc 


agcacgtgat 


gaaaaagcct 


3540 


gaactcaccg 


cgacgtctgt 


cgagaagttt 


ctgatcgaaa 


agttcgacag 


cgtctccgac 


3600 


ctgatgcagc 


tctcggaggg 


cgaagaatct 


cgtgctttca 


gcttcgatgt 


aggagggcgt 


3660 


ggatatgtcc 


tgcgggtaaa 


tagctgcgcc 


gatggtttct 


acaaagatcg 


ttatgtttat 


3720 


cggcactttg 


catcggccgc 


gctcccgatt 


ccggaagtgc 


ttgacattgg 


ggaattcagc 


3780 


gagagcctga 


cctattgcat 


ctcccgccgt 


gcacagggtg 


tcacgttgca 


agacctgcct 


3840 


gaaaccgaac 


tgcccgctgt 


tctgcagccg 


gtcgcggagg 


ccatggatgc 


gatcgctgcg 


3900 


gccgatctta 


gccagacgag 


cgggttcggc 


ccattcggac 


cgcaaggaat 


cggtcaatac 


3960 


actacatggc 


gtgatttcat 


atgcgcgatt 


gctgatcccc 


atgtgtatca 


ctggcaaact 


4020 


gtgatggacg 


acaccgtcag 


tgcgtccgtc 


gcgcaggctc 


tcgatgagct 


gatgctttgg 


4080 


gccgaggact 


gccccgaagt 


ccggcacctc 


gtgcacgcgg 


atttcggctc 


caacaatgtc 


4140 


ctgacggaca 


atggccgcat 


aacagcggtc 


attgactgga 


gcgaggcgat 


gttcggggat 


4200 


tcccaatacg 


aggtcgccaa 


catcttcttc 


tggaggccgt 


ggttggcttg 


tatggagcag 


4260 


cagacgcgct 


acttcgagcg 


gaggcatccg 


gagcttgcag 


gatcgccgcg 


gctccgggcg 


4320 


tatatgctcc


gcattggtct 


tgaccaactc 


tatcagagct 


tggttgacgg 


caatttcgat 


4380 


gatgcagctt 


gggcgcaggg 


tcgatgcgac 


gcaatcgtcc 


gatccggagc 


cgggactgtc 


4440 


gggcgtacac 


aaatcgcccg 


cagaagcgcg 


gccgtctgga 


ccgatggctg 


tgtagaagta 


4500 


ctcgccgata 


gtggaaaccg 


acgccccagc 


actcgtccga 


gggcaaagga 


atagcacgtg 


4560 


ctacgagatt 


tcgattccac 


cgccgccttc 


tatgaaaggt 


tgggcttcgg 


aatcgttttc 


4620 


cgggacgccg 


gctggatgat 


cctccagcgc 


ggggatctca 


tgctggagtt 


cttcgcccac 


4680 


cccaacttgt 


ttattgcagc 


ttataatggt 


tacaaataaa 


gcaatagcat 


cacaaatttc 


4740 


acaaataaag 


catttttttc 


actgcattct 


agttgtggtt 


tgtccaaact 


catcaatgta 


4800 


tcttatcatg 


tctgtatacc 


gtcgacctct 


agctagagct 


tggcgtaatc 


atggtcatag 


4860 


ctgtttcctg 


tgtgaaattg 


ttatccgctc 


acaattccac 


acaacatacg 


agccggaagc 


4920 


ataaagtgta 


aagcctgggg 


tgcctaatga 


gtgagctaac 


tcacattaat 


tgcgttgcgc 


4980 


tcactgcccg 


ctttccagtc 


gggaaacctg 


tcgtgccagc 


tgcattaatg 


aatcggccaa 


5040 


cgcgcgggga 


gaggcggttt 


gcgtattggg 


cgctcttccg 


cttcctcgct 


cactgactcg 


5100 


ctgcgctcgg 


tcgttcggct 


gcggcgagcg 


gtatcagctc 


actcaaaggc 


ggtaatacgg 


5160 


ttatccacag 


aatcagggga 


taacgcagga 


aagaacatgt 


gagcaaaagg 


ccagcaaaag 


5220 
gccaggaacc


gtaaaaaggc 


cgcgttgctg 


gcgtttttcc 


ataggctccg 


cccccctgac 


5280


Cf3.^C3tC3.C5. 


aaaatcgacg 


ctcaagtcag 


aggtggcgaa 


acccgacagg 


actataaaga 


5340 




^ ^ r'i z"*^ ffff 

t ucccccuy y 


aagctccctc 


gtgcgctctc 


ctgttccgac 


cctgccgctt 


5400 






tctcccttcg ggaagcgtgg 


cgctttctca 


tagctcacgc 


5460




tcagttcggt 


gtaggtcgtt 


cgctccaagc 


tgggctgtgt 


gcacgaaccc 


5520 


^ /T*fft" 4* a f^rt 


ccydCCy ci^y 


cgccttatcc 


ggtaactatc 


gtcttgagtc 


caacccggta


5580


Cty CtV_. CL^M Ct^ L. 


i~ a t~ rrr** a ^ 


ggcagcagcc 


actggtaaca 


ggattagcag 


agcgaggtat


5640 


gtaggcggtg 


oUdOdy dy U U 


cttgaagtgg 


tggcctaact 


acggctacac 


tagaagaaca 


5700 


/ "1 1- a "f" ^ ^ fTTT^ a 


L. L.y L^y u L< u 


gctgaagcca 


gttaccttcg 


gaaaaagagt 


tggtagctct 


5760




ddUdddwCdU 


cgctggtagc 


ggtttttttg 


tttgcaagca 


gcagattacg 


5820 


t~*r~^ f~* 3 2 :2 =3 =1 


a.3gga.t c t C3. 


agaagatcct 


ttgatctttt 


ctacggggtc 


tgacgctcag 


5880 


L.yycici.tjycLcLci. 


dv^ L'^.^d^^y L. L. d 


agggattttg 


gtcatgagat 


tatcaaaaag 


gatcttcacc 


5940 


^CL^ Cl U L« L. L> 


't~aaa't~'f~aaaa 
L. ddd L- L. dddd 


atgaagtttt 


aaatcaatct 


aaagtatata


tgagtaaact 


6000




^ a a a 4* ^ 

y L. L.dL>Lrfdd uy 


cttaatcagt 


gaggcaccta 


tctcagcgat 


ctgCGCattt 


6060 


r^c^^r i~ ^a t* ^ a 
U L. (.rCL L. L> ^ Ct 


uay k. t.y ccuy 


actccccgtc 


gtgtagataa 


ctacgatacg 


ggagggctta 


6120 


a 1~ ^ t" f^nr* f* 


c c a.y y c L. y c 


aatgataccg 


cgagacccac 


gctcaccggc 


tccagattta 


6180 




a ^ 9 ftf^r* a nf* 

dL^Cay CCay C 


cggaagggcc 


gagcgcagaa 


gtggtcctgc 


aactttatcc 


6240 


^ 1^ ^ V«> CL 


a frt" ^ a ^ ^ a a 
dy L.i» L-d L.dd 


ttgttgccgg gaagctagag 


taagtagttc


gccagttaat 


6300 


a ct^ H~fTr*rTr'a 
dy L>L> uyL^y^d 


d^y u L.y L. uy 0 


cattgctaca ggcatcgtgg 


tgtcacgctc 


gtcgtttggt 


6360 


fi \~ cscir* t" a "h 

R> w w CL l_ 


■|~ a nr* t~ i^rrrr 
L^L^dy L-ov^yy 


ttcccaacga 


tcaaggcgag 


ttacatgatc 


ccccatgttg


6420 


tcrnaa aa aacr 


^ y y L, L. dy 


cttcggtcct 


ccgatcgttg 


tcagaagtaa 


gttggccgca 


6480 


y L-yL. L^du^d^ 


L.L^ d Ly y L- L-d L- 


ggcagcactg 


cataattctc 


ttactgtcat 


gccatccgta 


6540


ai~Ta 'l~f~ri~''H*f"'I~'H 
dy duyuuL-UL 


ctgtgactgg 


tgagtactca 


accaagtcat 


tctgagaata 


gtgtatgcgg 


6600


v-'y^Vrf^rfydy l. u 




ggcgtcaata 


cgggataata 


ccgcgccaca 


tagcagaact 


6660 


^t"a aa s rrl" 

l> L. ddddy U-y L> 


^CaUCaU ugg 


aaaacgttct 


tcggggcgaa 


aactctcaag gatcttaccg 


6720 


ctgttgagat 


ccagttcgat 


gtaacccact 


cgtgcaccca 


actgatcttc 


agcatctttt 


6780 


actttcacca 


gcgtttctgg 


gtgagcaaaa


acaggaaggc 


aaaatgccgc 


aaaaaaggga 


6840 


ataagggcga 


cacggaaatg 


ttgaatactc 


atactcttcc 


tttttcaata 


ttattgaagc 


6900 


atttatcagg gttattgtct 


catgagcgga 


tacatatttg 


aatgtattta gaaaaataaa 


6960 


caaatagggg 


ttccgcgcac 


atttccccga 


aaagtgccac 


ctgacgtc 




7008 
<210> 93

<211> 11693 

<212> DNA 

<213> Artificial 

<220> 

<223> Description of Artificial Sequence: Expression Vector 

<400> 93 



gttgacattg 


attattgact 


agttattaat 


agtaatcaat 


tacggggtca 


ttagttcata 


60


gcccatatat 


ggagttccgc 


gttacataac 


ttacggtaaa 


tggcccgcct 


ggctgaccgc 


120 


ccaacgaccc 


ccgcccattg 


acgtcaataa 


tgacgtatgt 


tcccatagta 


acgccaatag 


180 


ggactttcca 


ttgacgtcaa 


tgggtggagt 


atttacggta 


aactgcccac 


ttggcagtac 


240 


atcaagtgta 


tcatatgcca 


agtccgcccc 


ctattgacgt 


caatgacggt 


aaatggcccg 


300 


cctggcatta 


tgcccagtac 


atgaccttac 


gggactttcc 


tacttggcag 


tacatctacg 


360 


tattagtcat 


cgctattacc 


atggtgatgc 


ggttttggca 


gtacaccaat 


gggcgtggat 


420 


agcggtttga 


ctcacgggga 


tttccaagtc 


tccaccccat 


tgacgtcaat 


gggagtttgt 


480 


tttggcacca 


aaatcaacgg 


gactttccaa 


aatgtcgtaa 


taaccccgcc 


ccgttgacgc 


540 


aaatgggcgg 


taggcgtgta 


cggtgggagg 


tctatataag 


cagagctcgt 


ttagtgaacc 


600 


gtaagctttc 


ggcgcgccac 


ggtaccatgg 


gatccgaaga 


cgccaaaaac 


ataaagaaag 


660 


gcccggcgcc 


attctatcct 


ctagaggatg 


gaaccgctgg 


agagcaactg 


cataaggcta 


720 


tgaagagata 


cgccctggtt 


cctggaacaa 


ttgcttttac 


agatgcacat 


atcgaggtga 


780 


acatcacgta 


cgcggaatac 


ttcgaaatgt 


ccgttcggtt 


ggcagaagct 


atgaaacgat 


840 


atgggctgaa 


tacaaatcac 


agaatcgtcg 


tatgcagtga 


aaactctctt 


caattcttta 


900 


tgccggtgtt 


gggcgcgtta 


tttatcggag 


ttgcagttgc 


gcccgcgaac 


gacatttata 


960 


atgaacgtga 


attgctcaac 


agtatgaaca 


tttcgcagcc 


taccgtagtg 


tttgtttcca 


1020 


aaaaggggtt 


gcaaaaaatt 


ttgaacgtgc 


aaaaaaaatt 


accaataatc 


cagaaaatta 


1080 


ttatcatgga 


ttctaaaacg 


gattaccagg 


gatttcagtc 


gatgtacacg 


ttcgtcacat 


1140 


ctcatctacc 


tcccggtttt 


aatgaatacg 


attttgtacc 


agagtccttt 


gatcgtgaca 


1200 


aaacaattgc 


actgataatg 


aattcctctg 


gatctactgg 


gttacctaag 


ggtgtggccc 


1260 


ttccgcatag 


aactgcctgc 


gtcagattct 


cgcatgccag 


agatcctatt 


tttggcaatc 


1320 


aaatcattcc 


ggatactgcg 


attttaagtg 


ttgttccatt 


ccatcacggt 


tttggaatgt 


1380 


ttactacact 


cggatatttg 


atatgtggat 


ttcgagtcgt 


cttaatgtat 


agatttgaag 


1440 


aagagctgtt 


tttacgatcc 


cttcaggatt 


acaaaattca 


aagtgcgttg 


ctagtaccaa 


1500 


ccctattttc 


attcttcgcc 


aaaagcactc 


tgattgacaa 


atacgattta 


tctaatttac 


1560 
ccccgggggc


ct r*\' ("•4- 4— 4~ 
yL.<cl^^L.wL- U-L. 


cgaaagaagt 


c 9959^ ^9 ^9 


y L.. L^y L.ddddv' 


1620


Cf c 1 C C 3.1^ c t 


+* /"> 3 nrrrr a 4~ a 

L.OL.ciyy y cL l-o. 


r^na f a a rrrra 4" 
L^y dvoddyy dL, 


dcygy c ucac 


Cy dy dL' L^dL>d 


4~i^anr^t"a+"4~r' 
\^ dy L> L. d L. L. ^ 


1680


t g* a. 1 1 a. c 3. c c 


cgagggggat 


*~f a 4" a a a r^r^nn 

y d udddL>L<yy 


gcgcggtcgg


4" a a « ^rt^ 4" 

caaagt.t.y t. u 


nr«a4-4-4-4-4-4-rr 
^^dL.LiL.UU L.y 


1740


a.a.9^ c 9^ a a-Q^^t 


tgtggatctg


gacaccggga 


aaacgctggg


/~i*~r4- 4- a a 4— a**T 

cgu L.ad uL^dy 


a fra rT*~f /~<i~ra a 4~ 

dy dyy L^gdaL. 


1800


tatgtgtcag


aggacctatg 


a 1. 1. a t.g t-c eg 


gttatgtaaa 


^ ^ ^ ^ ft ft a a 

caatccyy d.d 


^^/^/^a ^ ^ a a ft ft 

y L.y dccaacg 


1860


cctt9^at'bga 


caaggatgga


tggctacatt


ctggagacat


age L.t-act.gy 


gacgaagacg 


1920


a.a.C aU Tii> t. C U t 


a 4~ a ^/~ra 

^a.ua.y L. uy 


^nr* ^ t" r^a a ^4" 
^y L> L. L.y ddy l. 


aa^^aa 
s... L. L. Udd L. L.dd 


a a ^ a a a rr^a 
d L.dL>dddy y d 


^ a ^ ^ a ft fit" ftfr 
L> d L. ^ dy y L. y y 


1980


cccccgctga


attggaatcg 


•a 4" a 4" 4~ ^^4~ 4~ a ^ 

a ua u ugui^ac 


aacaccccaa 


^ a ^ 4" f*ft^ f* 

cauc L.i^cy dC 


ft ft ft ft ft ft ft ^ ftfi 

gcgggcgcgg 


2040




^^^a r^ri^ 4~^^a 

eg acy auy ac 


gccggugddC 


L. UL.L.L.y L.L.y 


err 4- 4- n 4- 4- rT4- 1* 
L.y L. L.y u L.y u l. 


4" 4* ftft^ ftfts ft ft 

i-tggagcacg 


2100




gacggaaaaa 


gagatcgtgg


attacgtcgc


/"< a a a ^^4~ a 

Cdy t-caay T-d 


a a a fftft frs 

dCddCCgCy a 


2160


3. 3.3. cLQ' 1 1 C 




gtgtttgtgg


3 /tff fa ^ i^4~ a ^ 
ctCyddy L-dL-O 


»~T a a a +~ +■ +■ 

y dddy y ll. u l 


ft f t~tft a a a a ^ 
dL.LyyddddL. 


2220


tcgacgcaag


aaaaatcaga


gagatcctca


taaaggccaa 


gaagggcgga 


aagtccaaat


2280




4~ a 3 f*i 4~ /~<#*Ta /~t 
C LadL. L.L.y dy 


aa4~aaa^a3 rr 
ddL.dddL.ddy 


4~4~aa^aaf^aa 
L L.ddL.ddOdd 


a a t" ^ftf* a 4~ 4~ 
L.dd L. L.y L.d L. L. 


a 4- i- 4- 4- a 4- rri- 
L.dLL-L.UdL.yL. 


2340


^ ^ a rrrt'^ ^ ^ a 
L. L. ^ o.^ ^ L. L. ^ a. 


333993-3g^y 


^ggg^gg^ 


^ ^ ^ a a a cxf^ a a 
L. U L. dddy L> dd 


rT4~aa aa/^o4"/^ 
y L.ddddL>L> L.^ 


4" a a a a ^ft^fi 
L>dL.dddL.y L»y 




3 Ti. a I. g g C 11. 3 a. 


ii.T^a.^gauccy 


y cugcd-cyc 


y cy ttt uggt 


ft a '^" ft a ft ft 4~ *~r 

y atgacygty 


aaaaf~i/^4~/^ i~ft 

ddddLiL* UL> uy 


2460


a. w el. L> a u g c a.g 


^ 4" ^ ^ ^ ffff a ^ a 

c t. c c cggaga 


uygx-cacagc 


4* 4" *^ 4" ^ 4" ^4" a a 

L-UyL-L^UyUi dd 


y^^^/^a 4* f\ft 

y uggatgccy 


ffft^ ft ft a ^a f* a 

y y ay uaycLca 




agcccgtcag


gcgtcagcgg 


gtgttggcgg 


gtgtcggggc


gcagccatga 


ggt cgact ck 


2580


a.gag2 a t. c g a 


tgccccgccc 


^ /■.fjf ^ /^^*ra ^ 

cygauy aacu 


a a a ^ /^+* ^a 

aaacctgacL. 


a ft ft ^ ft ^ ^ ft 

acgacaucL-c 


^ ft ft ft ft ft^ ^ f\^ 

gC C L> C L. L. C L. 


2640


U(^^ ^^^^^ (.<a 


/rt" ^^^^ a 4" a a 

y L.y ^d L.y udd 


L. L> L> L> L. L. ^dy L. 


uy y L. L.y y L.d^ 


ddL> L. L>y L>L^dd 


L> L.yyy LfL.L' L.y 






gdCdcgyggg 


y y y d Lf L^ dd dL^ 


a r< a a a nt^nr^lr 
dCdddgyygu 


4" r^4~ 4" ft SI ft\'ft 
L.L^LL^ UydL- U-y 


4" a fT4" t"^a ^ a ^ 
Udy L. Uy dL.dL- 


2760


LfOU UCtUdCLCtL. 


(~r/~ra 4" i^fH /~if^ a 

yyciuy i-.yt^civ_ 


d L. L Uy L. L.ddL. 


dL. L.y ciy i^y y u 


^^^/^a4~^^4~ ft 

L.L.L.LrfdL'L^L' Uy 


y dy \^ dy dL^ l. u 


2820




/~T(^a 0 4— /~ri~i a r3 

yycicL-yL-ctciL. 


T3/~<r3a/~<a+~4~ 
dL-ddL-dLLyC 


1" -f- i- 1- a t- rri- rrl- 


aactcttggc


4~ ct a a i~r (~t "H "H 4~ 

Ly ddy L- 1_U U L 


2880




tggg999-cat 


gtacctccca


9999C3Ccagg 


aagactacgg


gaggctacac 


2940




cagaggggcc


tgtgtagcta 


ccgataagcg


gaccctcaag 


agggcatztag 


3000




L> d L> d dy y w 


^^^^^Vf^4*4~aa^ 
L>Lf L. L.y L. Udd^ 


^ ^ 4" a a a r^frorr 
UdddL.yyy 


^ a a 4" Q 4~ ft ft 

UdyCdL.dL.yC 


4" 4" ^ ^ ft ft ft ft ^ a 

L. L. u C cy y y Ud 


3060


^ii.agcciL.aua 


c uducuagau 


taaccctaat 


4~ /~t a a a fxf* a 4~ 

L.caaL.ay cau 


at g ti. ti a c c ca 


acgggaagca 


3120


t" a t" rrr^ 4* a 4* r*n 
uetu^^UaL. oy 


a a ^ ^ a nnrt^ 4" 

ddL. udy y y u u 


Q ^^4" a a a a ftrTft 

dy uddddy yy 


^ ^ a a /^f^a a 
L. ^ U L. ddy y dd 


ft a ftftff^ \- \- ft 

cdycgauauc 


^ftftfi'Sftftftft'^ 

ucccacccca 


3180


tgagctgtca 


cggttttatt 


tacatggggt 


caggattcca 


cgagggtagt 


gaaccatttt 


3240 


agtcacaagg 


gcagtggctg 


aagatcaagg 


agcgggcagt 


gaactctcct 


gaatcttcgc 


3300 


ctgcttcttc 


attctccttc 


gtttagctaa 


tagaataact 


gctgagttgt 


gaacagtaag 


3360 


gtgtatgtga 


ggtgctcgaa 


aacaaggttt 


caggtgacgc 


ccccagaata 


aaatttggac 


3420 
ggggggttca gtggtggcat tgtgctatga caccaatata accctcacaa accccttggg 3480

caataaatac tagtgtagga atgaaacatt ctgaatatct ttaacaatag aaatccatgg 3540 

ggtggggaca agccgtaaag actggatgtc catctcacac gaatttatgg ctatgggcaa 3600 

cacataatcc tagtgcaata tgatactggg gttattaaga tgtgtcccag gcagggacca 3660

agacaggtga accatgttgt tacactctat ttgtaacaag gggaaagaga gtggacgccg 3720

acagcagcgg actccactgg ttgtctctaa cacccccgaa aattaaacgg ggctccacgc 3780 

caatggggcc cataaacaaa gacaagtggc cactcttttt tttgaaattg tggagtgggg 3840

gcacgcgtca gcccccacac gccgccctgc ggttttggac tgtaaaataa gggtgtaata 3900 

acttggctga ttgtaacccc gctaaccact gcggtcaaac cacttgccca caaaaccact 3960 

aatggcaccc cggggaatac ctgcataagt aggtgggcgg gccaagatag gggcgcgatt 4020 

gctgcgatct ggaggacaaa ttacacacac ttgcgcctga gcgccaagca cagggttgtt 4080 

ggtcctcata ttcacgaggt cgctgagagc acggtgggct aatgttgcca tgggtagcat 4140 

atactaccca aatatctgga tagcatatgc tatcctaatc tatatctggg tagcataggc 4200

tatcctaatc tatatctggg tagcatatgc tatcctaatc tatatctggg tagtatatgc 4260 

tatcctaatt tatatctggg tagcataggc tatcctaatc tatatctggg tagcatatgc 4320 

tatcctaatc tatatctggg tagtatatgc tatcctaatc tgtatccggg tagcatatgc 4380

tatcctaata gagattaggg tagtatatgc tatcctaatt tatatctggg tagcatatac 4440 

tacccaaata tctggatagc atatgctatc ctaatctata tctgggtagc atatgctatc 4500 

ctaatctata tctgggtagc ataggctatc ctaatctata tctgggtagc atatgctatc 4560 

ctaatctata tctgggtagt atatgctatc ctaatttata tctgggtagc ataggctatc 4620 

ctaatctata tctgggtagc atatgctatc ctaatctata tctgggtagt atatgctatc 4680 

ctaatctgta tccgggtagc atatgctatc ctcatgcata tacagtcagc atatgatacc 4740 

cagtagtaga gtgggagtgc tatcctttgc atatgccgcc acctcccaag ggggcgtgaa 4800 

ttttcgctgc ttgtcctttt cctgctggtt gctcccattc ttaggtgaat ttaaggaggc 4860 

caggctaaag ccgtcgcatg tctgattgct caccaggtaa atgtcgctaa tgttttccaa 4920 

cgcgagaagg tgttgagcgc ggagctgagt gacgtgacaa catgggtatg cccaattgcc 4980 

ccatgttggg aggacgaaaa tggtgacaag acagatggcc agaaatacac caacagcacg 5040 

catgatgtct actggggatt tattctttag tgcgggggaa tacacggctt ttaatacgat 5100 

tgagggcgtc tcctaacaag ttacatcact cctgcccttc ctcaccctca tctccatcac 5160 

ctccttcatc tccgtcatct ccgtcatcac cctccgcggc agccccttcc accataggtg 5220 
gaaaccaggg


aggcaaatct 


actccatcgt 




r'^r*acit' cape 




5280


aggtaggagc gggctttgtc 


ataacaaggt 








5340


atatatgagt 


ttgtaaaaag 


accatgaaat 


a a r* 3 cts a a i" 




o-y^gygccag 


5400


gttgtgggcc 


gggtccaggg gccattccaa 


a nrtCTf^si rra rf 

y y y «-y s-cg 


actcaatggt 


gtaagacgac 




attgtggaat 


agcaagggca gttcctcgcc 




aagggaggtc 


ttactacctc 




catatacgaa 


cacaccggcg 


acccaagttc 


cttcgtcggt 


agtcctttct 


acgtgactcc 


5580


tagccaggag agctcttaaa 


ccttctgcaa 


tgttctcaaa 


tttcgggttg gaacctcctt 


5640


gaccacgatg 


cttttccaaa 


ccaccctcct 


tttttgcgcc 


ctgcctccat 


caccctgacc 


5700


ccggggtcca gtgcttgggc 


cttctcctgg gtcatctgcg gggccctgct 


ctatcgctcc 


5760


cgggggcacg 


tcaggctcac 


catctgggcc 


accttcttgg 


tggtattcaa 


aataatcggc 


5820


ttcccctaca 


gggtggaaaa 


atggccttct 


acctggaggg 


ggcctgcgcg 


gtggagaccc 


5880


ggatgatgat 


gactgactac 


tgggactcct 


gggcctcttt 


tctccacgtc 


cacgacctct 


5940


ccccctggct 


ctttcacgac 


ttccccccct 


ggctctttca 


cgtcctctac 


cccggcggcc 


6000


tccactacct 


cctcgacccc 


ggcctccact 


acctcctcga 


ccccggcctc 


cactgcctcc 


6060


tcgaccccgg cctccacctc 


ctgctcctgc 


ccctcctgct 


cctgcccctc 


ctcctgctcc 




tgcccctcct gcccctcctg ctcctgcccc 


tcctgcccct 


cctgctcctg 


cccctcctgc 


6180


ccctcctgct 


cctgcccctc 


ctgcccctcc 


tcctgctcct 


gcccctcctg 


cccctcctcc 




tgctcctgcc 


cctcctgccc 


ctcctgctcc 


tgcccctcct 


gcccctcctg 


ctcctgcccc 


6300


tcctgcccct cctgctcctg cccctcctgc 


tcctgcccct 


cctgctcctg 


cccctcctgc 


6360


tcctgcccct 


cctgcccctc 


ctgcccctcc 


tcctgctcct 


gcccctcctg 


ctcctgcccc 


6420


tcctgcccct 


cctgcccctc 


ctgctcctgc 


ccctcctcct 


gctcctgccc 


ctcctgcccc 


6480


tcctgcccct 


cctcctgctc 


ctgcccctcc 


tgcccctcct 


cctgctcctg 


cccctcctcc 




tgctcctgcc 


cctcctgccc 


ctcctgcccc 


tcctcctgct 


cctgcccctc 


ctgcccctcc 


6600


tcctgctcct 


gcccctcctc 


ctgctcctgc 


ccctcctgcc 


cctcctgccc 


ctcctcctgc 


6660


tcctgcccct 


cctcctgctc 


ctgcccctcc 


tgcccctcct 


gcccctcctg 


cccctcctcc 


6720


tgctcctgcc 


cctcctcctg 


ctcctgcccc 


tcctgctcct 


gcccctcccg 


ctcctgctcc 


6780


tgctcctgtt 


ccaccgtggg 


tccctttgca 


gccaatgcaa 


cttggacgtt 


tttggggtct 


6840


ccggacacca 


tctctatgtc 


ttggcGctga 


tcctgagccg 


cccggggctc 


ctggtcttcc 


6900 


gcctcctcgt 


cctcgtcctc 


ttccccgtcc 


tcgtccatgg 


ttatcacccc 


ctcttctttg 


6960 


aggtccactg 


ccgccggagc 


cttctggtcc 


agatgtgtct 


cccttctctc 


ctaggccatt 


7020 


tccaggtcct gtacctggcc 


cctcgtcaga 


catgattcac 


actaaaagag 


atcaatagac 


7080 
atctttatta


gacgacgctc 


agtgaataca 


gggagtgcag 


actcctgccc 


cctccaacag 


7140 


cccccccacc 


ctcatcccct 


tcatggtcgc 


tgtcagacag atccaggtct 


gaaaattccc 


7200 


catcctccga 


accatcctcg 


tcctcatcac 


caattactcg 


cagcccggaa 


aactcccgct 


7260 


gaacatcctc 


aagatttgcg 


tcctgagcct 


caagccaggc 


ctcaaattcc 


tcgtccccct 


7320 


ttttgctgga 


cggtagggat 


ggggattctc 


gggacccctc 


ctcttcctct 


tcaaggtcac 


7380 


cagacagaga 


tgctactggg 


gcaacggaag 


aaaagctggg 


tgcggcctgt 


gaggatcagc 


7440 


ttatcgatga 


taagctgtca 


aacatgagaa 


ttcttgaaga 


cgaaagggcc 


tcgtgatacg 


7500 


cctattttta


taggttaatg 


tcatgataat 


aatggtttct 


tagacgtcag 


gtggcacttt 


7560


tcggggaaat 


gtgcgcggaa 


cccctatttg tttatttttc 


taaatacatt 


caaatatgta 


7620 


tccgctcatg 


agacaataac 


cctgataaat 


gcttcaataa 


tattgaaaaa 


ggaagagtat 


7680 


gagtattcaa 


catttccgtg 


tcgcccttat 


tccctttttt 


gcggcatttt 


gccttcctgt 


7740 


ttttgctcac 


ccagaaacgc 


tggtgaaagt 


aaaagatgct 


gaagatcagt 


tgggtgcacg 


7800 


agtgggttac 


atcgaactgg 


atctcaacag cggtaagatc 


cttgagagtt 


ttcgccccga 


7860 


agaacgtttt 


ccaatgatga 


gcacttttaa 


agttctgcta 


tgtggcgcgg 


tattatcccg 


7920 


tgttgacgcc 


gggcaagagc 


aactcggtcg ccgcatacac 


tattctcaga 


atgacttggt 


7980 


tgagtactca 


ccagtcacag 


aaaagcatct 


tacggatggc 


atgacagtaa 


gagaattatg 


8040 


cagtgctgcc 


ataaccatga 


gtgataacac 


tgcggccaac 


ttacttctga 


caacgatcgg 


8100 


aggaccgaag 


gagctaaccg 


cttttttgca 


caacatgggg 


gatcatgtaa 


ctcgccttga


8160 


tcgttgggaa 


ccggagctga 


atgaagccat 


accaaacgac 


gagcgtgaca 


ccacgatgcc 


8220 


tgcagcaatg gcaacaacgt 


tgcgcaaact 


attaactggc 


gaactactta


ctctagcttc 


8280 


ccggcaacaa 


ttaatagact 


ggatggaggc 


ggataaagtt 


gcaggaccac 


ttctgcgctc 


8340 


ggcccttccg gctggctggt 


ttattgctga 


taaatctgga 


gccggtgagc 


gtgggtctcg 


8400 


cggtatcatt 


gcagcactgg 


ggccagatgg 


taagccctcc 


cgtatcgtag 


ttatctacac 


8460 


gacggggagt 


caggcaacta 


tggatgaacg 


aaatagacag 


atcgctgaga 


taggtgcctc 


8520 


actgattaag cattggtaac 


tgtcagacca 


agtttactca 


tatatacttt 


agattgattt 


8580 


aaaacttcat 


ttttaattta 


aaaggatcta ggtgaagatc 


ctttttgata atctcatgac 


8640 


caaaatccct 


taacgtgagt 


tttcgttcca 


ctgagcgtca 


gaccccgtag 


aaaagatcaa 


8700 


aggatcttct 


tgagatcctt 


tttttctgcg cgtaatctgc 


tgcttgcaaa 


caaaaaaacc 


8760 


accgctacca 


gcggtggttt 


gtttgccgga 


tcaagagcta 


ccaactcttt 


ttccgaaggt 


8820 


aactggcttc 


agcagagcgc 


agataccaaa 


tactgtcctt 


ctagtgtagc 


cgtagttagg 


8880 
ccaccacttc


aagaactctg


tagcaccgcc 


tacatacctc 


gctctgctaa


tcctgttacc 


8940 


agtggctgct 


qccaqtagcCT 


ataagtcgtg 


tcttaccggg 


ttggactcaa 


gacgatagtt 


9000 


accggataag


gcgcagcggt


cgggctgaac 


ggggggttcg


tgcacacagc 


ccagcttgga 


9060 




tacaccgaac 


tgagatacct 


acagcgtgag 


ctatgagaaa 


gcgccacgct 


9120 




agaaaggcgg


acaggtatcc 


ggtaagcggc


agggtcggaa 


caggagagcg


9180 




cttccagggg 


gaaacgcctg 


gtatctttat 


agtcctgtcg 


ggtttcgcca 


9240 


cctctgactt 


gagcgtcgat


ttttgtgatg 


ctcgtcaggg


gggcggagcc


tatggaaaaa 


9300 


cgccagcaac 


gcggcctttt 


tacggttcct 


ggccttttgc 


tggccttgaa 


gctgtccctg 


9360 


atcrcrtccrfcca 



tctacctgcc 


tggacagcat 


ggcctgcaac 


gcgggcatcc 



cgatgccgcc 


9420 


crcraa.ctcaa.cfa. 


agaatcataa 


tggggaaggc


catccagcct 


cgcgtcgcga 


acgccagcaa 


9480 


gacgtagccc


agcgcgtcgg 


ccccgagatg 


cgccgcgtgc


ggctgctgga 


gatggcggac 


9540 


accratcraata 


tgttctgcca 


acrqqttqqtt 


tgcgcattca 


cagttctccg 


caagaattga 


9600 


ttggctccaa 


ttcttggagt 


ggtgaatccg 


ttagcgaggt 


gccgccctgc 


ttcatccccg 


9660 


tggcccgttg 


ctcgcgtttg 


ctggcggtgt


ccccggaaga 


aatatatttg 


catgtcttta 


9720 


gttctatgat 


gacacaaacc 


ccgcccagcg 


tcttgtcatt 


ggcgaattcg


aacacgcaga 


9780 


tcicaQtcctcjq 


QCqqcqcqqt 



ccgaggtcca 


cttcgcatat 


taaggtgacg 


cgtgtggcct


9840 


cgaacaccga 


gcgaccctgc 


agcgacccgc 


ttaacagcgt 


caacagcgtg 


ccgcagatcc 


9900 




tgagatatga 


aaaagcctga 


actcaccgcg 


acgtctgtcg 


agaagtttct 


9960 


gatcgaaaag 


ttcgacagcg 


tctccgacct 


gatgcagctc 


tcggagggcg 


aagaatctcg 


10020 


tgctttcagc 


ttcgatgtag 


gagggcgtgg


atatgtcctg 


cgggtaaata 


gctgcgccga 


10080 


tggtttctac 


aaagatcgtt 


atgtttatcg 


gcactttgca 


tcggccgcgc 


tcccgattcc 


10140 


ggaagtgctt


gacattgggg 


aattcagcga 


gagcctgacc 


tattgcatct 


cccgccgtgc 


10200 


acagggtgtc 


acgttgcaag 


acctgcctga 


aaccgaactg 


cccgctgttc 


tgcagccggt 


10260 


cgcggaggcc


atggatgcga 


tcgctgcggc 


cgatcttagc 


cagacgagcg


ggttcggccc


10320 


attcggaccg 


caaggaatcg 


gtcaatacac 


tacatggcgt 


gatttcatat 


gcgcgattgc 


10380 


tgatccccat 


gtgtatcact 


ggcaaactgt 


gatggacgac 


accgtcagtg 


cgtccgtcgc 


10440 


gcaggctctc 


gatgagctga


tgctttgggc


cgaggactgc 


cccgaagtcc 


ggcacctcgt 


10500 


gcacgcggat 


ttcggctcca 


acaatgtcct 


gacggacaat 


ggccgcataa 


cagcggtcat 


10560 


tgactggagc 


gaggcgatgt 


tcggggattc 


ccaatacgag 


gtcgccaaca 


tcttcttctg 


10620 


gaggccgtgg 


ttggcttgta 


tggagcagca 


gacgcgctac 


ttcgagcgga 


ggcatccgga 


10680 


gcttgcagga 


tcgccgcggc 


tccgggcgta 


tatgctccgc 


attggtcttg 


accaactcta 


10740 
tcagagcttg


gttgacggca 


atttcgatga 


tgcagcttgg 


gcgcagggtc 


gatgcgacgc 


10800 


aatcgtccga 


tccggagccg 


ggactgtcgg 


gcgtacacaa 


atcgcccgca 


gaagcgcggc 


10860 


cgtctggacc 


gatggctgtg 


tagaagtact 


cgccgatagt 


ggaaaccgac 


gccccagcac 


10920 


tcgtccggat 


cgggagatgg 


gggaggctaa 


ctgaaacacg 


gaaggagaca 


ataccggaag 


10980 


gaacccgcgc 


tatgacggca 


ataaaaagac 


agaataaaac 


gcacgggtgt 


tgggtcgttt 


11040 


gttcataaac 


gcggggttcg 


gtcccagggc 


tggcactctg 


tcgatacccc 


accgagaccc 


11100 


cattggggcc 


aatacgcccg 


cgtttcttcc 


ttttccccac 


cccacccccc 


aagttcgggt 


11160 


gaaggcccag 


ggctcgcagc 


caacgtcggg 


gcggcaggcc 


ctgccatagc 


cactggcccc 


11220 


gtgggttagg 


gacggggtcc 


cccatgggga 


atggtttatg 


gttcgtgggg 


gttattattt 


11280 


gggcgttgcg 


tggggtcagg 


tccacgactg 


gactgagcag 


acagacccat 


ggtttttgga 


11340 


tggcctgggc 


atggaccgca 


tgtactggcg 


cgacacgaac 


accgggcgtc 


tgtggctgcc 


11400 


aaacaccccc 


gacccccaaa 


aaccaccgcg 


cggatttctg 


gcgtgccaag 


ctagtcgacc 


11460 


aattctcatg 


tttgacagct 


tatcatcgca 


gatccgggca 


acgttgttgc 


cattgctgca 


11520 


ggcgcagaac 


tggtaggtat 


ggaagatcta 


tacattgaat 


caatattggc 


aattagccat 


11580 


attagtcatt 


ggttatatag 


cataaatcaa 


tattggctat 


tggccattgc 


atacgttgta 


11640 


tctatatcat 


aatatgtaca 


tttatattgg 


ctcatgtcca 


atatgaccgc 


cat 


11693 



<210> 94 

<211> 4825 
<400> 94



gacggatcgg gagatctccc 


gatcccctat 


ggtgcactct 


cagtacaatc 


tgctctgatg 


60 


ccgcatagtt 


aagccagtat 


ctgctccctg 


cttgtgtgtt 


ggaggtcgct 


gagtagtgcg 


120 


cgagcaaaat 


ttaagctaca 


acaaggcaag 


gcttgaccga 


caattgcatg 


aagaatctgc 


180 


ttagggttag gcgttttgcg 


ctgcttcgcg 


atgtacgggc 


cagatatacg 


cgttgacatt 


240 


gattattgac 


tagttattaa 


tagtaatcaa 


ttacggggtc 


attagttcat 


agcccatata 


300 


tggagttccg 


cgttacataa 


cttacggtaa 


atggcccgcc 


tggctgaccg cccaacgacc 


360 


cccgcccatt 


gacgtcaata 


atgacgtatg 


ttcccatagt 


aacgccaata 


gggactttcc 


420 


attgacgtca 


atgggtggag 


tatttacggt 


aaactgccca 


cttggcagta 


catcaagtgt 


480 


atcatatgcc 


aagtacgccc 


cctattgacg


tcaatgacgg 


taaatggccc gcctggcatt


540 
atgcccagta


catgacctta 


tgggactttc ctacttggca 


gtacatctac 


gtattagtca 


600 


tcgctattac 


catggtgatg 


cggttttggc agtacatcaa 


tgggcgtgga 


tagcggtttg 


660 


actcacgggg 


atttccaagt 


ctccacccca ttgacgtcaa 


tgggagtttg 


ttttggcacc 


720 


aaaatcaacg 


ggactttcca 


aaatgtcgta acaactccgc 


cccattgacg 


caaatgggcg 


780 


gtaggcgtgt 


acggtgggag gtctatataa gcagagctct 


ctggctaact 


aagctttcgg 


840 


cgcgccgagg 


taccatggga 


tccgaagacg ccaaaaacat 


aaagaaaggc 


ccggcgccat 


900 


tctatcctct 


agaggatgga 


accgctggag agcaactgca 


taaggctatg 


aagagatacg 


960 


ccctggttcc 


tggaacaatt 


gcttttacag atgcacatat 


cgaggtgaac 


atcacgtacg 


1020 


cggaatactt 


cgaaatgtcc 


gttcggttgg cagaagctat 


gaaacgatat 


gggctgaata 


1080 


caaatcacag aatcgtcgta 


tgcagtgaaa actctcttca 


attctttatg 


ccggtgttgg


1140 


gcgcgttatt 


tatcggagtt 


gcagttgcgc ccgcgaacga 


catttataat 


gaacgtgaat 


1200 


tgctcaacag 


tatgaacatt 


tcgcagccta ccgtagtgtt 


tgtttccaaa 


aaggggttgc 


1260 


aaaaaatttt 


gaacgtgcaa 


aaaaaattac caataatcca 


gaaaattatt 


atcatggatt 


1320 


ctaaaacgga 


ttaccaggga 


tttcagtcga tgtacacgtt 


cgtcacatct 


catctacctc 


1380 


ccggttttaa 


tgaatacgat 


tttgtaccag agtcctttga 


tcgtgacaaa 


acaattgcac 


1440 


tgataatgaa ttcctctgga tctactgggt tacctaaggg tgtggccctt 


ccgcatagaa 


1500 


ctgcctgcgt 


cagattctcg catgccagag atcctatttt 


tggcaatcaa 


atcattccgg 


1560 


atactgcgat 


tttaagtgtt 


gttccattcc atcacggttt 


tggaatgttt 


actacactcg 


1620 


gatatttgat 


atgtggattt 


cgagtcgtct taatgtatag 


atttgaagaa 


gagctgtttt 


1680 


tacgatccct 


tcaggattac 


aaaattcaaa gtgcgttgct 


agtaccaacc 


ctattttcat 


1740 


tcttcgccaa 


aagcactctg 


attgacaaat acgatttatc 


taatttacac 


gaaattgctt 


1800 


ctgggggcgc 


acctctttcg 


aaagaagtcg gggaagcggt 


tgcaaaacgc 


ttccatcttc 


1860 


cagggatacg 


acaaggatat 


gggctcactg agactacatc 


agctattctg 


attacacccg 


1920 


agggggatga 


taaaccgggc 


gcggtcggta aagttgttcc 


attttttgaa 


gcgaaggttg


1980 


tggatctgga 


taccgggaaa 


acgctgggcg ttaatcagag 


aggcgaatta 


tgtgtcagag


2040 


gacctatgat 


tatgtccggt 


tatgtaaaca atccggaagc 


gaccaacgcc 


ttgattgaca 


2100 


aggatggatg 


gctacattct 


ggagacatag cttactggga 


cgaagacgaa 


cacttcttca 


2160 


tagttgaccg 


cttgaagtct 


ttaattaaat acaaaggata 


tcaggtggcc 


cccgctgaat 


2220 


tggaatcgat 


attgttacaa 


caccccaaca tcttcgacgc 


gggcgtggca 


ggtcttcccg 


2280 


acgatgacgc 


cggtgaactt 


cccgccgccg ttgttgtttt 


ggagcacgga 


aagacgatga 


2340 


cggaaaaaga 


gatcgtggat 


tacgtcgcca gtcaagtaac 


aaccgcgaaa 


aagttgcgcg 


2400 
gaggagttgt


gtttgtggac 


gaagtaccga 


aaggtcttac 


cggaaaactc 


gacgcaagaa 


2460 


aaatcagaga 


gatcctcata 


aaggccaaga 


agggcggaaa 


gtccaaattg 


cgcggccgct 


2520 


aactcgagaa 


taaaatgagg 


aaattgcatc 


gcattgtctg 


agtaggtgtc 


attctattct 


2580 


ggggggtggg 


gtggggcagg 


acagcaaggg 


ggaggattgg 


gaagacaata 


gcaggcatgc 


2640 


tggggatgcg


gtgggctcta 


tggcttctga 


ggcggaaaga 


accagctggg 


gctctagggg 


2700 


gtatccccac 


gcgccctgta 


gcggcgcatt 


aagcgcggcg 


ggtgtggtgg 


ttacgcgcag 


2760 


cgtgaccgct 


acacttgcca 


gcgccctagc 


gcccgctcct 


ttcgctttct 


tcccttcctt 


2820 


tctcgccacg 


ttcgccggct 


ttccccgtca 


agctctaaat 


cgggggtccc 


tttagggttc 


2880 


cgatttagtg 


ctttacggca 


cctcgacccc 


aaaaaacttg 


attagggtga 


tggttcacgt 


2940 


acctagaagt 


tcctattccg 


aagttcctat 


tctctagaaa 


gtataggaac 


ttccttggcc 


3000 


aaaaagcctg 


aactcaccgc 


gacgtctgtc 


gagaagtttc 


tgatcgaaaa 


gttcgacagc 


3060 


gtctccgacc 


tgatgcagct 


ctcggagggc 


gaagaatctc 


gtgctttcag 


cttcgatgta 


3120 


ggagggcgtg 


gatatgtcct 


gcgggtaaat 


agctgcgccg 


atggtttcta 


caaagatcgt 


3180 


tatgtttatc 


ggcactttgc 


atcggccgcg 


ctcccgattc 


cggaagtgct 


tgacattggg 


3240 


gaattcagcg 


agagcctgac 


ctattgcatc 


tcccgccgtg 


cacagggtgt 


cacgttgcaa 


3300 


gacctgcctg 


aaaccgaact 


gcccgctgtt 


ctgcagccgg 


tcgcggaggc 


catggatgcg 


3360 


atcgctgcgg 


ccgatcttag 


ccagacgagc 


gggttcggcc 


cattcggacc 


gcaaggaatc 


3420 


ggtcaataca 


ctacatggcg 


tgatttcata 


tgcgcgattg 


ctgatcccca 


tgtgtatcac 


3480 


tggcaaactg 


tgatggacga 


caccgtcagt 


gcgtccgtcg 


cgcaggctct 


cgatgagctg 


3540 


atgctttggg 


ccgaggactg 


ccccgaagtc 


cggcacctcg 


tgcagcaaac 


aaaccaccgc 


3600 


tggtagcggt 


ttttttgttt 


gcaagcagca 


gattacgcgc 


agaaaaaaag 


gatctcaaga 


3660 


agatcctttg 


atcttttcta 


cggggtctga 


cgctcagtgg 


aacgaaaact 


cacgttaagg 


3720 


gattttggtc 


atgagattat 


caaaaaggat 


cttcacctag 


atccttttaa 


attaaaaatg 


3780 


aagttttaaa 


tcaatctaaa 


gtatatatga 


gtaaacttgg 


tctgacagtt 


accaatgctt 


3840 


aatcagtgag 


gcacctatct 


cagcgatctg 


tctatttcgt 


tcatccatag 


ttgcctgact 


3900 


ccccgtcgtg 


tagataacta 


cgatacggga 


gggcttacca 


tctggcccca 


gtgctgcaat 


3960 


gataccgcga 


gacccacgct 


caccggctcc 


agatttatca 


gcaataaacc 


agccagccgg 


4020 


aagggccgag 


cgcagaagtg 


gtcctgcaac 


tttatccgcc 


tccatccagt 


ctattaattg 


4080 


ttgccgggaa 


gctagagtaa 


gtagttcgcc 


agttaatagt 


ttgcgcaacg 


ttgttgccat 


4140 


tgctacaggc


atcgtggtgt 


cacgctcgtc 


gtttggtatg 


gcttcattca 


gctccggttc 


4200 
ccaacgatca aggcgagtta catgatcccc catgttgtgc aaaaaagcgg ttagctcctt 4260

cggtcctccg atcgttgtca gaagtaagtt ggccgcagtg ttatcactca tggttatggc 4320 

agcactgcat aattctctta ctgtcatgcc atccgtaaga tgcttttctg tgactggtga 4380 

gtactcaacc aagtcattct gagaatagtg tatgcggcga ccgagttgct cttgcccggc 4440 

gtcaatacgg gataataccg cgccacatag cagaacttta aaagtgctca tcattggaaa 4500 

acgttcttcg gggcgaaaac tctcaaggat cttaccgctg ttgagatcca gttcgatgta 4560 

acccactcgt gcacccaact gatcttcagc atcttttact ttcaccagcg tttctgggtg 4620 

agcaaaaaca ggaaggcaaa atgccgcaaa aaagggaata agggcgacac ggaaatgttg 4680 

aatactcata ctcttccttt ttcaatatta ttgaagcatt tatcagggtt attgtctcat 4740 

gagcggatac atatttgaat gtatttagaa aaataaacaa ataggggttc cgcgcacatt 4800 

tccccgaaaa gtgccacctg acgtc 4825 






